diff mbox

[RFA,2/2,master,only] gdb/windows-nat.c: Get rid of main_thread_id global

Message ID 1555453982-77808-3-git-send-email-brobecker@adacore.com
State New
Headers show

Commit Message

Joel Brobecker April 16, 2019, 10:33 p.m. UTC
Hello,

This global is meant to point to the "main" thread of execution of
the program we are debugging. It is set when attaching to a process
or when receiving a CREATE_PROCESS_DEBUG_EVENT event. The theory at
the time was that this was also going to be the thread receiving
the EXIT_PROCESS_DEBUG_EVENT event.

Unfortunately, we have discovered since then that this is actually
not guaranteed. What this means in practice is that there is moderate
risk that main_thread_id refers to a thread which no longer exists.

This global is used in 3 situations:
  - OUTPUT_DEBUG_STRING_EVENT
  - LOAD_DLL_DEBUG_EVENT
  - UNLOAD_DLL_DEBUG_EVENT

It's not clear why we would need to use the main_thread_id in those cases
instead of using the thread ID provided by the kernel events itself.
So this patch implements this approach, which then allows us to delete
the main_thread_id global.

gdb/testsuite:

	* windows-nat.c (main_thread_id): Delete.
	(handle_output_debug_string): Replace main_thread_id by
	current_event.dwThreadId.
	(fake_create_process): Likewise.
	(get_windows_debug_event) <CREATE_PROCESS_DEBUG_EVENT>:
	Do not set main_thread_id.
	<LOAD_DLL_DEBUG_EVENT>: Replace main_thread_id by
	current_event.dwThreadId.
	<UNLOAD_DLL_DEBUG_EVENT>: Likewise.

Tested on x86-windows and x86_64-windows using AdaCore's testsuite.
I'd like to observe this patch in our nightly testing environment
for at least a week before pushing it to master here. But it doesn't
preclude us from reviewing the patch, IMO.

Thoughts? OK for master (after a week of AdaCore testing)?

Thanks

Comments

Eli Zaretskii April 17, 2019, 2:31 p.m. UTC | #1
> From: Joel Brobecker <brobecker@adacore.com>
> Date: Tue, 16 Apr 2019 18:33:02 -0400
> 
> This global is meant to point to the "main" thread of execution of
> the program we are debugging. It is set when attaching to a process
> or when receiving a CREATE_PROCESS_DEBUG_EVENT event. The theory at
> the time was that this was also going to be the thread receiving
> the EXIT_PROCESS_DEBUG_EVENT event.
> 
> Unfortunately, we have discovered since then that this is actually
> not guaranteed. What this means in practice is that there is moderate
> risk that main_thread_id refers to a thread which no longer exists.
> 
> This global is used in 3 situations:
>   - OUTPUT_DEBUG_STRING_EVENT
>   - LOAD_DLL_DEBUG_EVENT
>   - UNLOAD_DLL_DEBUG_EVENT
> 
> It's not clear why we would need to use the main_thread_id in those cases
> instead of using the thread ID provided by the kernel events itself.
> So this patch implements this approach, which then allows us to delete
> the main_thread_id global.
> 
> gdb/testsuite:
> 
> 	* windows-nat.c (main_thread_id): Delete.
> 	(handle_output_debug_string): Replace main_thread_id by
> 	current_event.dwThreadId.
> 	(fake_create_process): Likewise.
> 	(get_windows_debug_event) <CREATE_PROCESS_DEBUG_EVENT>:
> 	Do not set main_thread_id.
> 	<LOAD_DLL_DEBUG_EVENT>: Replace main_thread_id by
> 	current_event.dwThreadId.
> 	<UNLOAD_DLL_DEBUG_EVENT>: Likewise.

Thanks, but isn't this throwing the baby with the bathwater, though?
AFAIK, the thread whose ID we get in the CREATE_PROCESS_DEBUG_EVENT
_is_ indeed the main thread, at least that's my reading of everything
I see about this on the Internet.  What might _not_ be true is that
the main thread exits last.  In fact, I could write a legitimate
Windows program whose main thread exited long before the process
terminated (although doing that in a console application that uses
'main' would be somewhat tricky at best).  Perhaps that's what happens
in the cases where you saw the problem.

AFAIU, the real problem with this global is that we delete the main
thread when its thread-termination notification is received, and we
also assume that EXIT_PROCESS_DEBUG_EVENT tells us that the main
thread exited.  But if this is the problem, then we could do one of
two things:

  . Keep the main_thread global, but use it just for avoiding the
    announcement of the main thread.

  . Install some logic that would avoid trying to delete a thread if
    its ID is equal to main_thread, when main_thread cannot be found
    in the list of threads.

This would allow us to avoid the crashes, but still not announce the
main thread as any other thread.

Does this make sense?
Joel Brobecker April 17, 2019, 5:38 p.m. UTC | #2
> > This global is meant to point to the "main" thread of execution of
> > the program we are debugging. It is set when attaching to a process
> > or when receiving a CREATE_PROCESS_DEBUG_EVENT event. The theory at
> > the time was that this was also going to be the thread receiving
> > the EXIT_PROCESS_DEBUG_EVENT event.
> > 
> > Unfortunately, we have discovered since then that this is actually
> > not guaranteed. What this means in practice is that there is moderate
> > risk that main_thread_id refers to a thread which no longer exists.
> > 
> > This global is used in 3 situations:
> >   - OUTPUT_DEBUG_STRING_EVENT
> >   - LOAD_DLL_DEBUG_EVENT
> >   - UNLOAD_DLL_DEBUG_EVENT
> > 
> > It's not clear why we would need to use the main_thread_id in those cases
> > instead of using the thread ID provided by the kernel events itself.
> > So this patch implements this approach, which then allows us to delete
> > the main_thread_id global.
> > 
> > gdb/testsuite:
> > 
> > 	* windows-nat.c (main_thread_id): Delete.
> > 	(handle_output_debug_string): Replace main_thread_id by
> > 	current_event.dwThreadId.
> > 	(fake_create_process): Likewise.
> > 	(get_windows_debug_event) <CREATE_PROCESS_DEBUG_EVENT>:
> > 	Do not set main_thread_id.
> > 	<LOAD_DLL_DEBUG_EVENT>: Replace main_thread_id by
> > 	current_event.dwThreadId.
> > 	<UNLOAD_DLL_DEBUG_EVENT>: Likewise.
> 
> Thanks, but isn't this throwing the baby with the bathwater, though?
> AFAIK, the thread whose ID we get in the CREATE_PROCESS_DEBUG_EVENT
> _is_ indeed the main thread, at least that's my reading of everything
> I see about this on the Internet.  What might _not_ be true is that
> the main thread exits last.  In fact, I could write a legitimate
> Windows program whose main thread exited long before the process
> terminated (although doing that in a console application that uses
> 'main' would be somewhat tricky at best).  Perhaps that's what happens
> in the cases where you saw the problem.

It happens completely randomly, and in all cases, the program itself
has a main program which waits for all software threads to complete
before itself terminating. Not sure what Windows is doing under
the hood.

> AFAIU, the real problem with this global is that we delete the main
> thread when its thread-termination notification is received, and we
> also assume that EXIT_PROCESS_DEBUG_EVENT tells us that the main
> thread exited.  But if this is the problem, then we could do one of
> two things:
> 
>   . Keep the main_thread global, but use it just for avoiding the
>     announcement of the main thread.
> 
>   . Install some logic that would avoid trying to delete a thread if
>     its ID is equal to main_thread, when main_thread cannot be found
>     in the list of threads.
> 
> This would allow us to avoid the crashes, but still not announce the
> main thread as any other thread.

The crash is already avoided by patch #1, which does _not_ delete
the global.

The reason why I want to delete the global is because it references
a thread that the Windows kernel told us does not exist anymore,
and I fear that keeping it around will cause us to pass it back
to the Windows kernel, thus triggering an error. As far as I can
tell from our uses of it as part of pure event handling, I don't see
why we would need that global at all. So, rather than stay on shaky
grounds regarding that question, I prefer we bite the bullet and
try to do it right.

The second question is about the main thread creation/exit
notifications. At the moment, what we are relying on to emit or
silence the notification is the kind of event we received.
So, for CREATE_PROCESS_DEBUG_EVENT/EXIT_PROCESS_DEBUG_EVENT events,
we silence the notification. Unfortunately, we have now discovered
that does not work, because EXIT_PROCESS_DEBUG_EVENT doesn't
always point the same thread as CREATE_PROCESS_DEBUG_EVENT.
If I understand you correctly, you want to preserve the global
so we can avoid those notifications. In my opinion, doing that
just for the purpose of filtering one notification is simply
not worth the trouble of keeping a global around.  Generally speaking,
we tend to try to avoid globals as much as we can, as they are hard to
track, and makes the code harder to understand and maintain.
Modifying the logic to use the main_thread_id global instead of
the kind of event is a step backwards, in my opinion.
Eli Zaretskii April 17, 2019, 6:29 p.m. UTC | #3
> Date: Wed, 17 Apr 2019 10:38:42 -0700
> From: Joel Brobecker <brobecker@adacore.com>
> Cc: gdb-patches@sourceware.org
> 
> It happens completely randomly, and in all cases, the program itself
> has a main program which waits for all software threads to complete
> before itself terminating. Not sure what Windows is doing under
> the hood.

At program termination time, all kinds of irregularities could happen,
especially if the program doesn't exit by exiting 'main' -- because
AFAIK doing the latter causes CRT to explicitly shut down all the
other threads.

But there's little doubt in my mind that the thread whose ID is
reported by CREATE_PROCESS_DEBUG_EVENT is the main thread.

> If I understand you correctly, you want to preserve the global
> so we can avoid those notifications. In my opinion, doing that
> just for the purpose of filtering one notification is simply
> not worth the trouble of keeping a global around.  Generally speaking,
> we tend to try to avoid globals as much as we can, as they are hard to
> track, and makes the code harder to understand and maintain.
> Modifying the logic to use the main_thread_id global instead of
> the kind of event is a step backwards, in my opinion.

If the only problem is having the global variable, we could mark the
main thread in some other way, that won't need the global.

But yes, I think not announcing the main thread is a nice feature, and
losing it due to the mess that happens at program exit (something to
which most users won't pay attention anyway) would be a pity.  I'd
even agree to announcing the exit of the main thread, if we cannot
come up with a better solution, if that could allow us avoid its
announcement at program startup.

The use case I have in mind is debugging a single-threaded program.
It is jarring enough, when debugging such a program, to see GDB
announcing threads started by the OS.  It would be nice if we could
avoid announcing the main thread.

Thanks.
Joel Brobecker April 17, 2019, 10:17 p.m. UTC | #4
> > If I understand you correctly, you want to preserve the global
> > so we can avoid those notifications. In my opinion, doing that
> > just for the purpose of filtering one notification is simply
> > not worth the trouble of keeping a global around.  Generally speaking,
> > we tend to try to avoid globals as much as we can, as they are hard to
> > track, and makes the code harder to understand and maintain.
> > Modifying the logic to use the main_thread_id global instead of
> > the kind of event is a step backwards, in my opinion.
> 
> If the only problem is having the global variable, we could mark the
> main thread in some other way, that won't need the global.
> 
> But yes, I think not announcing the main thread is a nice feature, and
> losing it due to the mess that happens at program exit (something to
> which most users won't pay attention anyway) would be a pity.  I'd
> even agree to announcing the exit of the main thread, if we cannot
> come up with a better solution, if that could allow us avoid its
> announcement at program startup.
> 
> The use case I have in mind is debugging a single-threaded program.
> It is jarring enough, when debugging such a program, to see GDB
> announcing threads started by the OS.  It would be nice if we could
> avoid announcing the main thread.

Personally, I feel the other way, especially since we always get
all sorts of random temporary thread notifications polutting the
output, that I don't mind the extra notification. And that way,
I know when each thread is actually terminated, and with what
error code.

But if you and others feel differently, I can switch my approach:

  - patch #1 (master+8.3): Just fix the crash, and nothing more.

        This means that, on 8.3, thread creation and exit notifications
        can be slightly off; this is considered an acceptable known
        problem, and is not a regression.

        This is actually what we've put in AdaCore's GDB as a first
        step to see we had any issue after that, and we did not.

  - patch #2 (master only): Remove uses of the global

        But we keep the filtering of the "main" thread. We still
        the known problem that the thread-exit notifications might
        be silenced for the wrong thread. Not a regression.

  - patch #3 (master only): Tag the main thread, and silence
    the thread-exit notification of the thread that got tagged.
    Depending on the implementation, this patch would potentially
    be acceptable for 8.3, but might need extra work to apply.
    I think we've had enough problems with 8.3 as it is, I don't
    want to add more things to do, so I vote OUT for 8.3.

I don't think I have time for that before the weekend or next week,
though.
Eli Zaretskii April 18, 2019, 12:52 p.m. UTC | #5
> Date: Wed, 17 Apr 2019 15:17:29 -0700
> From: Joel Brobecker <brobecker@adacore.com>
> Cc: gdb-patches@sourceware.org
> 
> Personally, I feel the other way, especially since we always get
> all sorts of random temporary thread notifications polutting the
> output, that I don't mind the extra notification. And that way,
> I know when each thread is actually terminated, and with what
> error code.
> 
> But if you and others feel differently, I can switch my approach:
> 
>   - patch #1 (master+8.3): Just fix the crash, and nothing more.
> 
>         This means that, on 8.3, thread creation and exit notifications
>         can be slightly off; this is considered an acceptable known
>         problem, and is not a regression.
> 
>         This is actually what we've put in AdaCore's GDB as a first
>         step to see we had any issue after that, and we did not.
> 
>   - patch #2 (master only): Remove uses of the global
> 
>         But we keep the filtering of the "main" thread. We still
>         the known problem that the thread-exit notifications might
>         be silenced for the wrong thread. Not a regression.
> 
>   - patch #3 (master only): Tag the main thread, and silence
>     the thread-exit notification of the thread that got tagged.
>     Depending on the implementation, this patch would potentially
>     be acceptable for 8.3, but might need extra work to apply.
>     I think we've had enough problems with 8.3 as it is, I don't
>     want to add more things to do, so I vote OUT for 8.3.
> 
> I don't think I have time for that before the weekend or next week,
> though.

Mine is just one opinion.  I think we should solicit opinions from
others before you invest all that effort.  If I'm the only one who is
bothered by your original proposal, I think you should go ahead and
push it.

Thanks.
Joel Brobecker April 18, 2019, 2:54 p.m. UTC | #6
> > Personally, I feel the other way, especially since we always get
> > all sorts of random temporary thread notifications polutting the
> > output, that I don't mind the extra notification. And that way,
> > I know when each thread is actually terminated, and with what
> > error code.
> > 
> > But if you and others feel differently, I can switch my approach:
> > 
> >   - patch #1 (master+8.3): Just fix the crash, and nothing more.
> > 
> >         This means that, on 8.3, thread creation and exit notifications
> >         can be slightly off; this is considered an acceptable known
> >         problem, and is not a regression.
> > 
> >         This is actually what we've put in AdaCore's GDB as a first
> >         step to see we had any issue after that, and we did not.
> > 
> >   - patch #2 (master only): Remove uses of the global
> > 
> >         But we keep the filtering of the "main" thread. We still
> >         the known problem that the thread-exit notifications might
> >         be silenced for the wrong thread. Not a regression.
> > 
> >   - patch #3 (master only): Tag the main thread, and silence
> >     the thread-exit notification of the thread that got tagged.
> >     Depending on the implementation, this patch would potentially
> >     be acceptable for 8.3, but might need extra work to apply.
> >     I think we've had enough problems with 8.3 as it is, I don't
> >     want to add more things to do, so I vote OUT for 8.3.
> > 
> > I don't think I have time for that before the weekend or next week,
> > though.
> 
> Mine is just one opinion.  I think we should solicit opinions from
> others before you invest all that effort.  If I'm the only one who is
> bothered by your original proposal, I think you should go ahead and
> push it.

Since I think you use the debugger on Windows more than I personally do,
I'll go the other way. Unless someone else tips the balance the other way,
I'll go with your opinion over mine ;-).
Pedro Alves April 18, 2019, 4:27 p.m. UTC | #7
(I should have read this thread before replying on the other one.)

On 4/17/19 11:17 PM, Joel Brobecker wrote:
> 
>   - patch #3 (master only): Tag the main thread, and silence
>     the thread-exit notification of the thread that got tagged.
>     Depending on the implementation, this patch would potentially
>     be acceptable for 8.3, but might need extra work to apply.
>     I think we've had enough problems with 8.3 as it is, I don't
>     want to add more things to do, so I vote OUT for 8.3.

This doesn't seem right to me, given that the main thread may
legitimately exit before other threads.  If this approach is taken,
maybe tweak it a bit:

silence the thread-exit notification of the thread that got tagged
_iff there are no other threads in the process_.

Thanks,
Pedro Alves
Joel Brobecker April 19, 2019, 8:43 p.m. UTC | #8
> >   - patch #3 (master only): Tag the main thread, and silence
> >     the thread-exit notification of the thread that got tagged.
> >     Depending on the implementation, this patch would potentially
> >     be acceptable for 8.3, but might need extra work to apply.
> >     I think we've had enough problems with 8.3 as it is, I don't
> >     want to add more things to do, so I vote OUT for 8.3.
> 
> This doesn't seem right to me, given that the main thread may
> legitimately exit before other threads.  If this approach is taken,
> maybe tweak it a bit:
> 
> silence the thread-exit notification of the thread that got tagged
> _iff there are no other threads in the process_.

Wouldn't that be confusing? What would happen in that case is that
we end up getting a a thread-exit notification with a thread ID
for which there hasn't been a new-thread notification. Without
knowing the implementation details of how GDB decides to silence
or not thread notifications, people are going to ask themselves
where the thread that exited came from, and why we didn't see
a new-thread notification for it before.

To decide which approach we want to take - what would be the issue
if we decided to silence the main thread's exit notification while
some other threads might still be alive? Assuming Pedro is making
a valid point, I think we should go back to pre-8.3 behavior, and
show all thread create/exit notifications, including for the main
thread.
Pedro Alves April 22, 2019, 2:24 p.m. UTC | #9
On 4/19/19 9:43 PM, Joel Brobecker wrote:
>>>   - patch #3 (master only): Tag the main thread, and silence
>>>     the thread-exit notification of the thread that got tagged.
>>>     Depending on the implementation, this patch would potentially
>>>     be acceptable for 8.3, but might need extra work to apply.
>>>     I think we've had enough problems with 8.3 as it is, I don't
>>>     want to add more things to do, so I vote OUT for 8.3.
>>
>> This doesn't seem right to me, given that the main thread may
>> legitimately exit before other threads.  If this approach is taken,
>> maybe tweak it a bit:
>>
>> silence the thread-exit notification of the thread that got tagged
>> _iff there are no other threads in the process_.
> 
> Wouldn't that be confusing? What would happen in that case is that
> we end up getting a a thread-exit notification with a thread ID
> for which there hasn't been a new-thread notification. Without
> knowing the implementation details of how GDB decides to silence
> or not thread notifications, people are going to ask themselves
> where the thread that exited came from, and why we didn't see
> a new-thread notification for it before.
> 
> To decide which approach we want to take - what would be the issue
> if we decided to silence the main thread's exit notification while
> some other threads might still be alive?

In my view, it's OK to not let the user know the main thread is
exiting when it's the last thread in the process because the process
is exiting right away.  But if the main thread is not the last
thread, then the process is going to stay around.  And in that case,
if a user wants thread exit notifications, I think it just makes sense
to let the user know a thread exited, more so since the main thread of
a program tends to be its most "important" thread.

I don't see the asymmetry (silent at creation, not silent at destruction)
in this particular case as surprising -- when the program is
starting up, there's only one thread, and I don't think users are
surprised to find that after running to main there's already one
thread in the thread list.

I also think that if we included the GDB thread id (and name) in thread creation/exited
notifications, similar to the output of inferior start/exit notifications, then users
would be even less surprised with the asymmetry, as users could clearly tell that
it's the main thread that is exiting:

Currently:

 [Thread 48192.0xd100 exited with code 1]

With gdb thread id and name:

[Thread 1.1 (48192.0xd100 "the thread's name") exited with code 1]

I think that would be a good overall change, regardless.

> Assuming Pedro is making
> a valid point, I think we should go back to pre-8.3 behavior, and
> show all thread create/exit notifications, including for the main
> thread.

Thanks,
Pedro Alves
André Pönitz April 22, 2019, 3:23 p.m. UTC | #10
On Mon, Apr 22, 2019 at 03:24:46PM +0100, Pedro Alves wrote:
> I don't see the asymmetry (silent at creation, not silent at destruction)
> in this particular case as surprising [...]

It's not surprising, but it's inconsistent and causes extra work in
cases one wants to mechanically examine such output.

Andre'
Pedro Alves April 22, 2019, 5:28 p.m. UTC | #11
On 4/22/19 4:23 PM, André Pönitz wrote:
> On Mon, Apr 22, 2019 at 03:24:46PM +0100, Pedro Alves wrote:
>> I don't see the asymmetry (silent at creation, not silent at destruction)
>> in this particular case as surprising [...]
> 
> It's not surprising, but it's inconsistent 
> and causes extra work in
> cases one wants to mechanically examine such output.

I'm curious about what extra work you're thinking of.  Can't think of
anything significant.

But do note that this discussion does not affect MI.

So what do you propose?  My proposal kept these two ideas in mind:

- be quiet wrt to threads with single-threaded programs.
- inform the user the main thread exited while other threads stay running.

Thanks,
Pedro Alves
André Pönitz April 22, 2019, 9:46 p.m. UTC | #12
On Mon, Apr 22, 2019 at 06:28:55PM +0100, Pedro Alves wrote:
> On 4/22/19 4:23 PM, André Pönitz wrote:
> > On Mon, Apr 22, 2019 at 03:24:46PM +0100, Pedro Alves wrote:
> >> I don't see the asymmetry (silent at creation, not silent at destruction)
> >> in this particular case as surprising [...]
> > 
> > It's not surprising, but it's inconsistent 
> > and causes extra work in
> > cases one wants to mechanically examine such output.
> 
> I'm curious about what extra work you're thinking of.

As far as I understand it, any (non-MI) user wanting to keep track of
running threads would need to find out which events are "not surprising"
(for which version of GDB...) and mentally (or in code) "generate"
the "missing" events. That's a kind of "work". 

> Can't think of anything significant.

It's not significant, but less surprising to me personally, i.e.
violating the principle of least astonishment.
 
> But do note that this discussion does not affect MI.

I understand that, and for thread related notifications MI is sufficient
for my frontend purposes, but in general MI does not cover all functionality
of CLI and already this discrepancy is a surprise to some degree. I'd
rather see the differences go away over time, not to increase.

> So what do you propose?  My proposal kept these two ideas in mind:
> 
> - be quiet wrt to threads with single-threaded programs.

TBH, I don't see the necessity of this requirement. That would be two
lines of noise per run. 

> - inform the user the main thread exited while other threads stay running.

Also here, I don't see enough gain in one suppressed line of output
to compensate for the inconsistency, this time even with a CLI user hat on.

Andre'
Eli Zaretskii April 23, 2019, 5:54 a.m. UTC | #13
> Date: Mon, 22 Apr 2019 23:46:07 +0200
> From: André Pönitz <apoenitz@t-online.de>
> Cc: Joel Brobecker <brobecker@adacore.com>, Eli Zaretskii <eliz@gnu.org>,
> 	gdb-patches@sourceware.org
> 
> > - be quiet wrt to threads with single-threaded programs.
> 
> TBH, I don't see the necessity of this requirement. That would be two
> lines of noise per run. 

It violates the same principle of least astonishment, in the important
use case of a program that has only one application thread, the main
thread.

And please also keep in mind that we are not talking about introducing
a new feature, we are talking about preserving an existing feature, in
a way that will avoid crashing GDB in some rare cases.  It sounds
wrong to me to lose a feature because its implementation causes
problems in a rare use case; we should instead fix the implementation.

> > - inform the user the main thread exited while other threads stay running.
> 
> Also here, I don't see enough gain in one suppressed line of output
> to compensate for the inconsistency, this time even with a CLI user hat on.

Forgive me for asking, but do you frequently debug with GDB on
Windows?  Because this issue is specific to native Windows debugging,
it won't affect any other platforms.  On Windows, we already have
threads popping up and exiting that are started by the OS beyond our
control, which is already quite jarring even to me, certainly to
someone who doesn't debug day in and day out.  Having the main thread
of my program add more noise is not insignificant, because ideally I
don't want to see any of my application threads shown unless my
program really started a thread.

Thanks.
Pedro Alves April 23, 2019, 11:41 a.m. UTC | #14
On 4/22/19 10:46 PM, André Pönitz wrote:
> On Mon, Apr 22, 2019 at 06:28:55PM +0100, Pedro Alves wrote:
>> On 4/22/19 4:23 PM, André Pönitz wrote:
>>> On Mon, Apr 22, 2019 at 03:24:46PM +0100, Pedro Alves wrote:
>>>> I don't see the asymmetry (silent at creation, not silent at destruction)
>>>> in this particular case as surprising [...]
>>>
>>> It's not surprising, but it's inconsistent 
>>> and causes extra work in
>>> cases one wants to mechanically examine such output.
>>
>> I'm curious about what extra work you're thinking of.
> 
> As far as I understand it, any (non-MI) user wanting to keep track of
> running threads would need to find out which events are "not surprising"
> (for which version of GDB...) and mentally (or in code) "generate"
> the "missing" events. That's a kind of "work". 

gdb on GNU/Linux already works like I'm suggesting, and has been doing it
for a few years, and I don't think I recall ever hearing complaints about it:

Simple single-threaded "hello world" program:

 (gdb) r
 Starting program: /home/pedro/brno/pedro/gdb/tests/main 
 [Inferior 1 (process 13298) exited normally]
 (gdb) 

Multi-threaded program that exits normally (gdb.threads/schedlock.exp testcase):

 (gdb) r
 Starting program: build/gdb/testsuite/outputs/gdb.threads/schedlock/schedlock 
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib64/libthread_db.so.1".
 [New Thread 0x7ffff74b8700 (LWP 13984)]            <<< extra thread
 [Thread 0x7ffff74b8700 (LWP 13984) exited]         <<< extra thread exits
 [Inferior 1 (process 13980) exited normally]
 (gdb) 

Threaded program that has the leader thread exit with pthread_exit while
another thread keeps running (gdb.threads/leader-exit.exp testcase):

 (gdb) r
 Starting program: build/gdb/testsuite/outputs/gdb.threads/leader-exit/leader-exit 
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib64/libthread_db.so.1".
 [New Thread 0x7ffff74b8700 (LWP 13230)]               <<<< extra thread
 [Thread 0x7ffff7fb6740 (LWP 13226) exited]            <<<< leader thread exits (notice LWP == process ID)
 [Inferior 1 (process 13226) exited normally]
 (gdb) 

Note there's some time between the leader exiting, and the process exiting.

If the output were changed like I was suggesting, we'd have something like this,
which I think would be much clearer:

 (gdb) r
 Starting program: build/gdb/testsuite/outputs/gdb.threads/schedlock/schedlock 
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib64/libthread_db.so.1".
 [New Thread 1.2 (0x7ffff74b8700 (LWP 13984))]
 [Thread 1.2 (0x7ffff74b8700 (LWP 13984)) exited]
 [Inferior 1 (process 13980) exited normally]
 (gdb)

And:

 (gdb) r
 Starting program: build/gdb/testsuite/outputs/gdb.threads/leader-exit/leader-exit 
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib64/libthread_db.so.1".
 [New Thread 1.2 (0x7ffff74b8700 (LWP 13230))]
 [Thread 1.1 (0x7ffff7fb6740 (LWP 13226)) exited]
 [Inferior 1 (process 13226) exited normally]
 (gdb) 

We're going on a tangent, but maybe even say "Thread ... appeared" instead
of "New Thread", for better visual alignment, like:

 (gdb) r
 Starting program: build/gdb/testsuite/outputs/gdb.threads/schedlock/schedlock 
 [Thread debugging using libthread_db enabled]
 Using host libthread_db library "/lib64/libthread_db.so.1".
 [Thread 1.2 (0x7ffff74b8700 (LWP 13984)) appeared]
 [Thread 1.2 (0x7ffff74b8700 (LWP 13984)) exited]
 [Inferior 1 (process 13980) exited normally]
 (gdb)

> 
>> Can't think of anything significant.
> 
> It's not significant, but less surprising to me personally, i.e.
> violating the principle of least astonishment.
>  
>> But do note that this discussion does not affect MI.
> 
> I understand that, and for thread related notifications MI is sufficient
> for my frontend purposes, but in general MI does not cover all functionality
> of CLI and already this discrepancy is a surprise to some degree. I'd
> rather see the differences go away over time, not to increase.
> 
>> So what do you propose?  My proposal kept these two ideas in mind:
>>
>> - be quiet wrt to threads with single-threaded programs.
> 
> TBH, I don't see the necessity of this requirement. That would be two
> lines of noise per run. 

I think I had originally proposed that when I switched gdb's model
to "always a thread", but got pushed back.  And I do think the pushback made
sense, particularly when you consider remote debugging targets that don't even
have a concept of threads.

I kind of convinced myself by considering bare metal targets, but also considering
further "execution unit" levels/layers -- say gdb adds native support for fibers, and
the model is that the initial thread has an initial fiber too -- i.e., there's always
a fiber, like there's always a thread.  Would we want to automatically announce the
initial fiber as soon as the process starts, even if the process doesn't make
use of fibers?

 (gdb) r
 Starting program: ...
 [New Thread 1.1 ...]
 [New Fiber 1.1 ...]
 .....
 [Fiber 1.1 ... exited]
 [Thread 1.1 exited]
 [Inferior 1 (process 13226) exited normally]
 (gdb) 

Or, would we rather be quiet about fibers until something creates a second one,
similarly to how we are quiet about threads until a second thread is created?

And fibers would be one kind of unit of execution.  There may be others.
What to do then?

> 
>> - inform the user the main thread exited while other threads stay running.
> 
> Also here, I don't see enough gain in one suppressed line of output
> to compensate for the inconsistency, this time even with a CLI user hat on.
Thanks,
Pedro Alves
André Pönitz April 24, 2019, 8:13 p.m. UTC | #15
On Tue, Apr 23, 2019 at 08:54:04AM +0300, Eli Zaretskii wrote:
> > Also here, I don't see enough gain in one suppressed line of output
> > to compensate for the inconsistency, this time even with a CLI user hat on.
> 
> Forgive me for asking, but do you frequently debug with GDB on
> Windows? 

Not frequently, and if so, only for testing purposes. 

But I do get regular feedback from a handful downstream users.

> Because this issue is specific to native Windows debugging,
> it won't affect any other platforms.

The general pattern (subjective inconsistency in GDB output - and 
input when I think about it) is, fortunately or not, cross-platform.

> On Windows, we already have
> threads popping up and exiting that are started by the OS beyond our
> control, which is already quite jarring even to me,

[It's not just you in this case...]

> certainly to someone who doesn't debug day in and day out.  Having
> the main thread of my program add more noise is not insignificant,
> because ideally I don't want to see any of my application threads
> shown unless my program really started a thread.

To be honest, I have still difficulties to find a mental model on what
irks a typical Windows users. On one hand, they can apparently stand
developing on that platform, on the other hand, two lines of extra
output, even issued for consistency reasons, appear to be a show
stopper. 

Andre'
Eli Zaretskii April 25, 2019, 5:39 a.m. UTC | #16
> Date: Wed, 24 Apr 2019 22:13:22 +0200
> From: André Pönitz <apoenitz@t-online.de>
> Cc: palves@redhat.com, brobecker@adacore.com, gdb-patches@sourceware.org
> 
> To be honest, I have still difficulties to find a mental model on what
> irks a typical Windows users. On one hand, they can apparently stand
> developing on that platform, on the other hand, two lines of extra
> output, even issued for consistency reasons, appear to be a show
> stopper.

That was sarcasm, I guess?

In any case, I'm not sure why you think developing for Windows in the
relevant use cases is something that needs some super-natural
endurance capabilities: I develop using Emacs, GCC, Binutils, GDB,
Grep, Find, and all the other tools you are familiar with.  It's very
similar to developing on a Posix platform.  GDB doesn't support debug
info in formats emitted by Windows compilers, it only supports
GCC-emitted info, AFAIK.

And two lines of extra output you don't expect are not a show-stopper,
but they do make you stop and think for a moment whether this is
something your code did.
diff mbox

Patch

diff --git a/gdb/windows-nat.c b/gdb/windows-nat.c
index 039eb13..d188ed0 100644
--- a/gdb/windows-nat.c
+++ b/gdb/windows-nat.c
@@ -237,7 +237,6 @@  static DEBUG_EVENT current_event;	/* The current debug event from
 					   WaitForDebugEvent */
 static HANDLE current_process_handle;	/* Currently executing process */
 static windows_thread_info *current_thread;	/* Info on currently selected thread */
-static DWORD main_thread_id;		/* Thread ID of the main thread */
 
 /* Counts of things.  */
 static int exception_count = 0;
@@ -1026,7 +1025,7 @@  handle_output_debug_string (struct target_waitstatus *ourstatus)
 	  ourstatus->kind = TARGET_WAITKIND_STOPPED;
 	  retval = strtoul (p, &p, 0);
 	  if (!retval)
-	    retval = main_thread_id;
+	    retval = current_event.dwThreadId;
 	  else if ((x = (LPCVOID) (uintptr_t) strtoull (p, NULL, 0))
 		   && ReadProcessMemory (current_process_handle, x,
 					 &saved_context,
@@ -1402,13 +1401,12 @@  fake_create_process (void)
        (unsigned) GetLastError ());
       /*  We can not debug anything in that case.  */
     }
-  main_thread_id = current_event.dwThreadId;
   current_thread
     = windows_add_thread (ptid_t (current_event.dwProcessId, 0,
 				  current_event.dwThreadId),
 			  current_event.u.CreateThread.hThread,
 			  current_event.u.CreateThread.lpThreadLocalBase);
-  return main_thread_id;
+  return current_event.dwThreadId;
 }
 
 void
@@ -1604,7 +1602,6 @@  get_windows_debug_event (struct target_ops *ops,
 	break;
 
       current_process_handle = current_event.u.CreateProcessInfo.hProcess;
-      main_thread_id = current_event.dwThreadId;
       /* Add the main thread.  */
       th = windows_add_thread
         (ptid_t (current_event.dwProcessId, 0,
@@ -1648,7 +1645,7 @@  get_windows_debug_event (struct target_ops *ops,
       catch_errors (handle_load_dll);
       ourstatus->kind = TARGET_WAITKIND_LOADED;
       ourstatus->value.integer = 0;
-      thread_id = main_thread_id;
+      thread_id = current_event.dwThreadId;
       break;
 
     case UNLOAD_DLL_DEBUG_EVENT:
@@ -1661,7 +1658,7 @@  get_windows_debug_event (struct target_ops *ops,
       catch_errors (handle_unload_dll);
       ourstatus->kind = TARGET_WAITKIND_LOADED;
       ourstatus->value.integer = 0;
-      thread_id = main_thread_id;
+      thread_id = current_event.dwThreadId;
       break;
 
     case EXCEPTION_DEBUG_EVENT: