linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2] do_wait: make PIDTYPE_PID case O(1) instead of O(n)
@ 2021-03-09 16:15 Jim Newsome
  2021-03-09 17:15 ` Oleg Nesterov
  0 siblings, 1 reply; 3+ messages in thread
From: Jim Newsome @ 2021-03-09 16:15 UTC (permalink / raw)
  To: Andrew Morton
  Cc: Oleg Nesterov, Eric W . Biederman, Christian Brauner,
	linux-kernel, Jim Newsome

do_wait is an internal function used to implement waitpid, waitid,
wait4, etc. To handle the general case, it does an O(n) linear scan of
the thread group's children and tracees.

This patch adds a special-case when waiting on a pid to skip these scans
and instead do an O(1) lookup. This improves performance when waiting on
a pid from a thread group with many children and/or tracees.

Signed-off-by: James Newsome
---
 kernel/exit.c | 54 ++++++++++++++++++++++++++++++++++++++++++---------
 1 file changed, 45 insertions(+), 9 deletions(-)

diff --git a/kernel/exit.c b/kernel/exit.c
index 04029e35e69a..312c4dfc9555 100644
--- a/kernel/exit.c
+++ b/kernel/exit.c
@@ -1439,6 +1439,33 @@ void __wake_up_parent(struct task_struct *p, struct task_struct *parent)
 			   TASK_INTERRUPTIBLE, p);
 }
 
+// Optimization for waiting on PIDTYPE_PID. No need to iterate through child
+// and tracee lists to find the target task.
+static int do_wait_pid(struct wait_opts *wo, struct task_struct *tsk)
+{
+	struct task_struct *target = pid_task(wo->wo_pid, PIDTYPE_PID);
+	if (!target) {
+		return 0;
+	}
+	if (tsk == target->real_parent ||
+	    (!(wo->wo_flags & __WNOTHREAD) &&
+	     same_thread_group(tsk, target->real_parent))) {
+		int retval = wait_consider_task(wo, /* ptrace= */ 0, target);
+		if (retval) {
+			return retval;
+		}
+	}
+	if (target->ptrace && (tsk == target->parent ||
+			       (!(wo->wo_flags & __WNOTHREAD) &&
+				same_thread_group(tsk, target->parent)))) {
+		int retval = wait_consider_task(wo, /* ptrace= */ 1, target);
+		if (retval) {
+			return retval;
+		}
+	}
+	return 0;
+}
+
 static long do_wait(struct wait_opts *wo)
 {
 	struct task_struct *tsk;
@@ -1464,18 +1491,27 @@ static long do_wait(struct wait_opts *wo)
 	set_current_state(TASK_INTERRUPTIBLE);
 	read_lock(&tasklist_lock);
 	tsk = current;
-	do {
-		retval = do_wait_thread(wo, tsk);
-		if (retval)
-			goto end;
 
-		retval = ptrace_do_wait(wo, tsk);
-		if (retval)
+	if (wo->wo_type == PIDTYPE_PID) {
+		retval = do_wait_pid(wo, tsk);
+		if (retval) {
 			goto end;
+		}
+	} else {
+		do {
+			retval = do_wait_thread(wo, tsk);
+			if (retval)
+				goto end;
 
-		if (wo->wo_flags & __WNOTHREAD)
-			break;
-	} while_each_thread(current, tsk);
+			retval = ptrace_do_wait(wo, tsk);
+			if (retval)
+				goto end;
+
+			if (wo->wo_flags & __WNOTHREAD)
+				break;
+		}
+		while_each_thread(current, tsk);
+	}
 	read_unlock(&tasklist_lock);
 
 notask:
-- 
2.30.1


^ permalink raw reply related	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] do_wait: make PIDTYPE_PID case O(1) instead of O(n)
  2021-03-09 16:15 [PATCH v2] do_wait: make PIDTYPE_PID case O(1) instead of O(n) Jim Newsome
@ 2021-03-09 17:15 ` Oleg Nesterov
  2021-03-09 20:42   ` Jim Newsome
  0 siblings, 1 reply; 3+ messages in thread
From: Oleg Nesterov @ 2021-03-09 17:15 UTC (permalink / raw)
  To: Jim Newsome
  Cc: Andrew Morton, Eric W . Biederman, Christian Brauner, linux-kernel

Jim,

Thanks, the patch looks good to me. Yet I think you need to send V3 even
if I personally do not care ;) Please consider ./scripts/checkpatch.pl,
it reports all the coding-style problems I was going to mention.

I too have a couple of cosmetic nits, but feel free to ignore, this is
subjective.

On 03/09, Jim Newsome wrote:
>
> do_wait is an internal function used to implement waitpid, waitid,
> wait4, etc. To handle the general case, it does an O(n) linear scan of
> the thread group's children and tracees.
> 
> This patch adds a special-case when waiting on a pid to skip these scans
> and instead do an O(1) lookup. This improves performance when waiting on
> a pid from a thread group with many children and/or tracees.
> 
> Signed-off-by: James Newsome
> ---
>  kernel/exit.c | 54 ++++++++++++++++++++++++++++++++++++++++++---------
>  1 file changed, 45 insertions(+), 9 deletions(-)
> 
> diff --git a/kernel/exit.c b/kernel/exit.c
> index 04029e35e69a..312c4dfc9555 100644
> --- a/kernel/exit.c
> +++ b/kernel/exit.c
> @@ -1439,6 +1439,33 @@ void __wake_up_parent(struct task_struct *p, struct task_struct *parent)
>  			   TASK_INTERRUPTIBLE, p);
>  }
>  
> +// Optimization for waiting on PIDTYPE_PID. No need to iterate through child
> +// and tracee lists to find the target task.
> +static int do_wait_pid(struct wait_opts *wo, struct task_struct *tsk)
> +{
> +	struct task_struct *target = pid_task(wo->wo_pid, PIDTYPE_PID);
> +	if (!target) {
> +		return 0;
> +	}
> +	if (tsk == target->real_parent ||
> +	    (!(wo->wo_flags & __WNOTHREAD) &&
> +	     same_thread_group(tsk, target->real_parent))) {
> +		int retval = wait_consider_task(wo, /* ptrace= */ 0, target);
> +		if (retval) {
> +			return retval;
> +		}
> +	}
> +	if (target->ptrace && (tsk == target->parent ||
> +			       (!(wo->wo_flags & __WNOTHREAD) &&
> +				same_thread_group(tsk, target->parent)))) {
> +		int retval = wait_consider_task(wo, /* ptrace= */ 1, target);
> +		if (retval) {
> +			return retval;
> +		}
> +	}

Both if's use "int retval", to me it would be better to declare this variable
at the start of do_wait_pid(). But again, I won't insist this is up to you.

I am wondering if something like

	static inline bool is_parent(struct task_struct *tsk, struct task_struct *p, int flags)
	{
		return	tsk == p || !(flags & __WNOTHREAD)) && same_thread_group(tsk, p);
	}

makes any sense to make do_wait_pid() more clear... probably not.

Oleg.


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: [PATCH v2] do_wait: make PIDTYPE_PID case O(1) instead of O(n)
  2021-03-09 17:15 ` Oleg Nesterov
@ 2021-03-09 20:42   ` Jim Newsome
  0 siblings, 0 replies; 3+ messages in thread
From: Jim Newsome @ 2021-03-09 20:42 UTC (permalink / raw)
  To: Oleg Nesterov
  Cc: Andrew Morton, Eric W . Biederman, Christian Brauner, linux-kernel

On 3/9/21 11:15, Oleg Nesterov wrote:
> Jim,
>
> Thanks, the patch looks good to me. Yet I think you need to send V3 even
> if I personally do not care ;) Please consider ./scripts/checkpatch.pl,
> it reports all the coding-style problems I was going to mention.

Thanks! I'd thought clang-format with the included configuration would
be sufficient, but apparently not :)

> Both if's use "int retval", to me it would be better to declare this variable
> at the start of do_wait_pid(). But again, I won't insist this is up to you.
My usual inclination is to avoid uninitialized variables and prefer
putting them in tighter scopes. I don't think it's really much of an
issue in this relatively short function though; happy to go with the
prevailing style.
> I am wondering if something like
>
> 	static inline bool is_parent(struct task_struct *tsk, struct task_struct *p, int flags)
> 	{
> 		return	tsk == p || !(flags & __WNOTHREAD)) && same_thread_group(tsk, p);
> 	}
>
> makes any sense to make do_wait_pid() more clear... probably not.

Yeah, I lean slightly towards the extra level of indirection not being
worth the deduplication.

I made a couple other small changes as well:

* No need for do_wait_pid to take the parameter `tsk` since it's only
ever called with `current`

* With that change, the declaration of `tsk` in `do_wait` can be moved
into a tighter scope of where it's used in the loop.

v3: https://lkml.org/lkml/2021/3/9/1134


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2021-03-09 20:43 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-03-09 16:15 [PATCH v2] do_wait: make PIDTYPE_PID case O(1) instead of O(n) Jim Newsome
2021-03-09 17:15 ` Oleg Nesterov
2021-03-09 20:42   ` Jim Newsome

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).