[v2/htab,4/6,Linux] Optimize PID -> struct lwp_info lookup
Commit Message
On 05/23/2016 02:12 PM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
>
>> Fix this by keeping a list of LWPs sorted by PID, so we can use binary
>> searches for lookup.
>
> Is it better to use htab?
>
I had assumed the usage pattern would be clearly in favor of a
sorted vector, but I tried a htab now, and at least for sets of
threads of the sizes I was looking at (2k -> 10k), the numbers show
otherwise.
attach-many-short-lived-threads has threads come and go
too quickly for it to be easy to tell any difference -- the
number of threads running at any given time isn't very stable.
So I hacked the test further to make it spawn threads, and
then have the threads just sleep. This provides a different
scenario, one where we always iterate over the /proc/PID/task/
lwp list exactly twice, so it may give different results, but
the difference is probably not that significant.
So, with a process that spawns many threads and just
has them all sleep, I timed attaching + detach with:
$ time gdb -iex "set print thread-events off" --batch -q -p PID
I ran that 3 times each for 4000 threads and 8000 threads, using
either a htab or a sorted vec + bsearch. The results
were:
8000 threads, htab:
| run | 1 | 2 | 3 |
|------+-----------+-----------+-----------|
| real | 0m29.831s | 0m31.483s | 0m30.333s |
| user | 0m20.548s | 0m22.124s | 0m21.143s |
| sys | 0m9.291s | 0m9.366s | 0m9.197s |
8000 threads, sorted vec:
| run | 1 | 2 | 3 |
|------+-----------+-----------+-----------|
| real | 0m33.569s | 0m32.178s | 0m32.480s |
| user | 0m24.027s | 0m22.981s | 0m23.344s |
| sys | 0m9.549s | 0m9.207s | 0m9.146s |
4000 threads, htab:
| run | 1 | 2 | 3 |
|------+----------+----------+----------|
| real | 0m6.710s | 0m6.683s | 0m6.891s |
| user | 0m4.561s | 0m4.525s | 0m4.593s |
| sys | 0m2.159s | 0m2.168s | 0m2.307s |
4000 threads, sorted vec:
| run | 1 | 2 | 3 |
|------+----------+----------+----------|
| real | 0m7.282s | 0m7.375s | 0m7.282s |
| user | 0m5.132s | 0m5.173s | 0m5.080s |
| sys | 0m2.162s | 0m2.210s | 0m2.211s |
So given these numbers, and applying the "default to a hash table
unless you have a good reason not to" principle, I agree that
it's better.
So here's v2, using a htab.
From f065eca4c841eddd2f917c5d2702c525a0065338 Mon Sep 17 00:00:00 2001
From: Pedro Alves <palves@redhat.com>
Date: Mon, 23 May 2016 14:38:58 +0100
Subject: [PATCH v2] [Linux] Optimize PID -> struct lwp_info lookup
Hacking the gdb.threads/attach-many-short-lived-threads.exp test to
spawn thousands of threads instead of dozens, and running gdb under
perf, I saw that GDB was spending most of the time in find_lwp_pid:
- captured_main
- 93.61% catch_command_errors
- 87.41% attach_command
- 87.40% linux_nat_attach
- 87.40% linux_proc_attach_tgid_threads
- 82.38% attach_proc_task_lwp_callback
- 81.01% find_lwp_pid
5.30% ptid_get_lwp
+ 0.10% ptid_lwp_p
+ 0.64% add_thread
+ 0.26% set_running
+ 0.24% set_executing
0.12% ptid_get_lwp
+ 0.01% ptrace
+ 0.01% add_lwp
attach_proc_task_lwp_callback is called once for each LWP that we
attach to, found by listing the /proc/PID/task/ directory. In turn,
attach_proc_task_lwp_callback calls find_lwp_pid to check whether the
LWP we're about to try to attach to is already known. Since
find_lwp_pid does a linear walk over the whole LWP list, this becomes
quadratic. We do the /proc/PID/task/ listing until we get two
iterations in a row where we find no new threads. So the second and
following times we walk the /proc/PID/task/ dir, we're going to take
an even worse find_lwp_pid hit.
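A back-of-the-envelope model (illustrative only, not GDB code; the
function names below are made up) shows why this blows up: with a
linear find_lwp_pid, attaching to N LWPs costs on the order of N^2/2
comparisons in total, while a constant-time lookup keeps the total
linear in N.

```c
#include <assert.h>

/* Toy cost model (hypothetical, not GDB code): the i-th attach with a
   linear find_lwp_pid scans the i LWPs already on the list, so the
   total number of comparisons for N attaches is 0 + 1 + ... + (N-1)
   = N*(N-1)/2, i.e. quadratic in N.  */

static long long
linear_lookup_cost (long long n)
{
  long long comparisons = 0;

  for (long long i = 0; i < n; i++)
    comparisons += i;		/* walk the i LWPs already known */
  return comparisons;
}

/* With a hash table, each lookup is expected O(1), so N attaches cost
   on the order of N operations.  */

static long long
hashed_lookup_cost (long long n)
{
  return n;
}
```

For the 8000-thread case above that is roughly 32 million comparisons
per full /proc/PID/task/ pass versus roughly 8000 hash probes, which
matches find_lwp_pid dominating the profile.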
Fix this by adding a hash table keyed by LWP PID, for fast lookup.
The linked list embedded in the LWP structure itself is kept, and made
a double-linked list, so that removals from that list are O(1). An
earlier version of this patch got rid of this list altogether, but
that revealed hidden dependencies / assumptions on how the list is
sorted. For example, killing a process and then waiting for all the
LWPs' statuses using iterate_over_lwps only works as is because the
leader LWP is always last in the list. So I thought it better to take
an incremental approach and make this patch concern itself _only_ with
the PID lookup optimization.
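The shape of the fix can be sketched in a few dozen lines (a minimal
standalone sketch with made-up names, not the actual
libiberty-htab-based code in the patch below): each LWP sits both in a
hash table keyed by its PID, for O(1) lookup, and in an intrusive
doubly-linked list kept in reverse creation order, for O(1) removal
and ordered iteration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Minimal illustrative sketch; the real patch uses libiberty's htab
   and GDB's struct lwp_info.  */

struct lwp
{
  int pid;
  struct lwp *prev, *next;	/* reverse-creation-order list */
  struct lwp *hnext;		/* hash bucket chain */
};

#define NBUCKETS 101

static struct lwp *buckets[NBUCKETS];
static struct lwp *lwp_list;	/* list head = most recently added */

static struct lwp *
lwp_add (int pid)
{
  struct lwp *lp = (struct lwp *) calloc (1, sizeof *lp);

  lp->pid = pid;

  /* Prepend to the list, keeping reverse creation order.  */
  lp->next = lwp_list;
  if (lwp_list != NULL)
    lwp_list->prev = lp;
  lwp_list = lp;

  /* Chain into the PID-keyed hash table.  */
  struct lwp **slot = &buckets[(unsigned) pid % NBUCKETS];
  lp->hnext = *slot;
  *slot = lp;
  return lp;
}

static struct lwp *
lwp_find (int pid)
{
  for (struct lwp *lp = buckets[(unsigned) pid % NBUCKETS];
       lp != NULL; lp = lp->hnext)
    if (lp->pid == pid)
      return lp;
  return NULL;
}

static void
lwp_delete (struct lwp *lp)
{
  /* Unlink from the doubly-linked list: O(1), no walk needed.  */
  if (lp->next != NULL)
    lp->next->prev = lp->prev;
  if (lp->prev != NULL)
    lp->prev->next = lp->next;
  else
    lwp_list = lp->next;

  /* Unlink from the hash bucket chain.  */
  struct lwp **slot = &buckets[(unsigned) lp->pid % NBUCKETS];
  while (*slot != lp)
    slot = &(*slot)->hnext;
  *slot = lp->hnext;

  free (lp);
}
```

The patch keeps its two views in lockstep the same way: add_initial_lwp
calls both lwp_list_add and lwp_lwpid_htab_add_lwp, and delete_lwp
undoes both.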
gdb/ChangeLog:
yyyy-mm-dd Pedro Alves <palves@redhat.com>
PR gdb/19828
* linux-nat.c (lwp_lwpid_htab): New htab.
(lwp_info_hash, lwp_lwpid_htab_eq, lwp_lwpid_htab_create)
(lwp_lwpid_htab_add_lwp): New functions.
(lwp_list): Tweak comment.
(lwp_list_add, lwp_list_remove, lwp_lwpid_htab_remove_pid): New
functions.
(purge_lwp_list): Rewrite, using htab_traverse_noresize.
(add_initial_lwp): Add lwp to htab too. Use lwp_list_add.
(delete_lwp): Use lwp_list_remove. Remove htab too.
(find_lwp_pid): Search in htab.
(_initialize_linux_nat): Call lwp_lwpid_htab_create.
* linux-nat.h (struct lwp_info) <prev>: New field.
---
gdb/linux-nat.c | 158 ++++++++++++++++++++++++++++++++++++++++++--------------
gdb/linux-nat.h | 4 +-
2 files changed, 123 insertions(+), 39 deletions(-)
Comments
Pedro Alves <palves@redhat.com> writes:
Hi Pedro,
Patch is good to me, a nit on comments,
> +/* Head of double-linked list of known LWPs. Sorted by reverse
> + creation order. This order is assumed in some cases. E.g.,
> + reaping status after killing all LWPs of a process: the leader LWP
> + must be reaped last. */
> struct lwp_info *lwp_list;
> +
> +/* Add LP to sorted-by-creation-order double-linked list. */
> +
To reflect the code,
s/sorted-by-creation-order/sorted-by-reverse-creation-order/
> +
> +/* Remove LP from sorted-by-creation-order double-linked list. */
> +
Likewise.
>
> - /* Next LWP in list. */
> + /* Previous and next pointers in double-linked list of known LWPs,
> + sorted by reverse creation order. */
> + struct lwp_info *prev;
On 05/24/2016 10:33 AM, Yao Qi wrote:
> Pedro Alves <palves@redhat.com> writes:
>
> Hi Pedro,
> Patch is good to me, a nit on comments,
>
Thanks, I fixed them.
Thanks,
Pedro Alves
@@ -677,8 +677,85 @@ linux_child_set_syscall_catchpoint (struct target_ops *self,
return 0;
}
-/* List of known LWPs. */
+/* List of known LWPs, keyed by LWP PID. This speeds up the common
+ case of mapping a PID returned from the kernel to our corresponding
+ lwp_info data structure. */
+static htab_t lwp_lwpid_htab;
+
+/* Calculate a hash from a lwp_info's LWP PID. */
+
+static hashval_t
+lwp_info_hash (const void *ap)
+{
+ const struct lwp_info *lp = (struct lwp_info *) ap;
+ pid_t pid = ptid_get_lwp (lp->ptid);
+
+ return iterative_hash_object (pid, 0);
+}
+
+/* Equality function for the lwp_info hash table. Compares the LWP's
+ PID. */
+
+static int
+lwp_lwpid_htab_eq (const void *a, const void *b)
+{
+ const struct lwp_info *entry = (const struct lwp_info *) a;
+ const struct lwp_info *element = (const struct lwp_info *) b;
+
+ return ptid_get_lwp (entry->ptid) == ptid_get_lwp (element->ptid);
+}
+
+/* Create the lwp_lwpid_htab hash table. */
+
+static void
+lwp_lwpid_htab_create (void)
+{
+ lwp_lwpid_htab = htab_create (100, lwp_info_hash, lwp_lwpid_htab_eq, NULL);
+}
+
+/* Add LP to the hash table. */
+
+static void
+lwp_lwpid_htab_add_lwp (struct lwp_info *lp)
+{
+ void **slot;
+
+ slot = htab_find_slot (lwp_lwpid_htab, lp, INSERT);
+ gdb_assert (slot != NULL && *slot == NULL);
+ *slot = lp;
+}
+
+/* Head of double-linked list of known LWPs. Sorted by reverse
+ creation order. This order is assumed in some cases. E.g.,
+ reaping status after killing all LWPs of a process: the leader LWP
+ must be reaped last. */
struct lwp_info *lwp_list;
+
+/* Add LP to sorted-by-creation-order double-linked list. */
+
+static void
+lwp_list_add (struct lwp_info *lp)
+{
+ lp->next = lwp_list;
+ if (lwp_list != NULL)
+ lwp_list->prev = lp;
+ lwp_list = lp;
+}
+
+/* Remove LP from sorted-by-creation-order double-linked list. */
+
+static void
+lwp_list_remove (struct lwp_info *lp)
+{
+ /* Remove from sorted-by-creation-order list. */
+ if (lp->next != NULL)
+ lp->next->prev = lp->prev;
+ if (lp->prev != NULL)
+ lp->prev->next = lp->next;
+ if (lp == lwp_list)
+ lwp_list = lp->next;
+}
+
/* Original signal mask. */
@@ -754,31 +831,30 @@ lwp_free (struct lwp_info *lp)
xfree (lp);
}
-/* Remove all LWPs belong to PID from the lwp list. */
+/* Traversal function for purge_lwp_list. */
-static void
-purge_lwp_list (int pid)
+static int
+lwp_lwpid_htab_remove_pid (void **slot, void *info)
{
- struct lwp_info *lp, *lpprev, *lpnext;
-
- lpprev = NULL;
+ struct lwp_info *lp = (struct lwp_info *) *slot;
+ int pid = *(int *) info;
- for (lp = lwp_list; lp; lp = lpnext)
+ if (ptid_get_pid (lp->ptid) == pid)
{
- lpnext = lp->next;
+ htab_clear_slot (lwp_lwpid_htab, slot);
+ lwp_list_remove (lp);
+ lwp_free (lp);
+ }
- if (ptid_get_pid (lp->ptid) == pid)
- {
- if (lp == lwp_list)
- lwp_list = lp->next;
- else
- lpprev->next = lp->next;
+ return 1;
+}
- lwp_free (lp);
- }
- else
- lpprev = lp;
- }
+/* Remove all LWPs belonging to PID from the lwp list. */
+
+static void
+purge_lwp_list (int pid)
+{
+ htab_traverse_noresize (lwp_lwpid_htab, lwp_lwpid_htab_remove_pid, &pid);
}
/* Add the LWP specified by PTID to the list. PTID is the first LWP
@@ -812,8 +888,11 @@ add_initial_lwp (ptid_t ptid)
lp->ptid = ptid;
lp->core = -1;
- lp->next = lwp_list;
- lwp_list = lp;
+ /* Add to sorted-by-creation-order list. */
+ lwp_list_add (lp);
+
+ /* Add to keyed-by-pid htab. */
+ lwp_lwpid_htab_add_lwp (lp);
return lp;
}
@@ -844,22 +923,24 @@ add_lwp (ptid_t ptid)
static void
delete_lwp (ptid_t ptid)
{
- struct lwp_info *lp, *lpprev;
+ struct lwp_info *lp;
+ void **slot;
+ struct lwp_info dummy;
- lpprev = NULL;
+ dummy.ptid = ptid;
+ slot = htab_find_slot (lwp_lwpid_htab, &dummy, NO_INSERT);
+ if (slot == NULL)
+ return;
- for (lp = lwp_list; lp; lpprev = lp, lp = lp->next)
- if (ptid_equal (lp->ptid, ptid))
- break;
+ lp = *(struct lwp_info **) slot;
+ gdb_assert (lp != NULL);
- if (!lp)
- return;
+ htab_clear_slot (lwp_lwpid_htab, slot);
- if (lpprev)
- lpprev->next = lp->next;
- else
- lwp_list = lp->next;
+ /* Remove from sorted-by-creation-order list. */
+ lwp_list_remove (lp);
+ /* Release. */
lwp_free (lp);
}
@@ -871,17 +952,16 @@ find_lwp_pid (ptid_t ptid)
{
struct lwp_info *lp;
int lwp;
+ struct lwp_info dummy;
if (ptid_lwp_p (ptid))
lwp = ptid_get_lwp (ptid);
else
lwp = ptid_get_pid (ptid);
- for (lp = lwp_list; lp; lp = lp->next)
- if (lwp == ptid_get_lwp (lp->ptid))
- return lp;
-
- return NULL;
+ dummy.ptid = ptid_build (0, lwp, 0);
+ lp = (struct lwp_info *) htab_find (lwp_lwpid_htab, &dummy);
+ return lp;
}
/* See nat/linux-nat.h. */
@@ -4874,6 +4954,8 @@ Enables printf debugging output."),
sigdelset (&suspend_mask, SIGCHLD);
sigemptyset (&blocked_mask);
+
+ lwp_lwpid_htab_create ();
}
@@ -101,7 +101,9 @@ struct lwp_info
/* Arch-specific additions. */
struct arch_lwp_info *arch_private;
- /* Next LWP in list. */
+ /* Previous and next pointers in double-linked list of known LWPs,
+ sorted by reverse creation order. */
+ struct lwp_info *prev;
struct lwp_info *next;
};