[PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu

linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
@ 2022-02-14 19:05 Paul E. McKenney
  2022-02-14 19:26 ` Chris Mason
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2022-02-14 19:05 UTC (permalink / raw)
  To: clm; +Cc: riel, viro, linux-kernel, linux-fsdevel, kernel-team

Experimental.  Not for inclusion.  Yet, anyway.

Freeing large numbers of namespaces in quick succession can result in
a bottleneck on the synchronize_rcu() invoked from kern_unmount().
This patch applies the synchronize_rcu_expedited() hammer to allow
further testing and fault isolation.

Hey, at least there was no need to change the comment!  ;-)

Cc: Alexander Viro <viro@zeniv.linux.org.uk>
Cc: <linux-fsdevel@vger.kernel.org>
Cc: <linux-kernel@vger.kernel.org>
Not-yet-signed-off-by: Paul E. McKenney <paulmck@kernel.org>

---

 namespace.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/namespace.c b/fs/namespace.c
index 40b994a29e90d..79c50ad0ade5b 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -4389,7 +4389,7 @@ void kern_unmount(struct vfsmount *mnt)
 	/* release long term mount so mount point can be released */
 	if (!IS_ERR_OR_NULL(mnt)) {
 		real_mount(mnt)->mnt_ns = NULL;
-		synchronize_rcu();	/* yecchhh... */
+		synchronize_rcu_expedited();	/* yecchhh... */
 		mntput(mnt);
 	}
 }

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-02-14 19:05 [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited() Paul E. McKenney
@ 2022-02-14 19:26 ` Chris Mason
  2022-02-14 19:44   ` Paul E. McKenney
  2022-02-15 18:28   ` Chris Mason
  0 siblings, 2 replies; 12+ messages in thread
From: Chris Mason @ 2022-02-14 19:26 UTC (permalink / raw)
  To: Paul E. McKenney, Giuseppe Scrivano
  Cc: riel, viro, linux-kernel, linux-fsdevel, Kernel Team

> On Feb 14, 2022, at 2:05 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> Experimental.  Not for inclusion.  Yet, anyway.
> 
> Freeing large numbers of namespaces in quick succession can result in
> a bottleneck on the synchronize_rcu() invoked from kern_unmount().
> This patch applies the synchronize_rcu_expedited() hammer to allow
> further testing and fault isolation.
> 
> Hey, at least there was no need to change the comment!  ;-)
> 

I don’t think this will be fast enough.  I think the problem is that commit e1eb26fa62d04ec0955432be1aa8722a97cb52e7 is putting all of the ipc namespace frees onto a list, and every free includes one call to synchronize_rcu()

The end result is that we can create new namespaces much much faster than we can free them, and eventually we run out.  I found this while debugging clone() returning ENOSPC because create_ipc_ns() was returning ENOSPC.

-chris

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-02-14 19:26 ` Chris Mason
@ 2022-02-14 19:44   ` Paul E. McKenney
  2022-02-14 20:55     ` Rik van Riel
  2022-02-15 18:28   ` Chris Mason
  1 sibling, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2022-02-14 19:44 UTC (permalink / raw)
  To: Chris Mason
  Cc: Giuseppe Scrivano, riel, viro, linux-kernel, linux-fsdevel, Kernel Team

On Mon, Feb 14, 2022 at 07:26:49PM +0000, Chris Mason wrote:
> 
> 
> > On Feb 14, 2022, at 2:05 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > Experimental.  Not for inclusion.  Yet, anyway.
> > 
> > Freeing large numbers of namespaces in quick succession can result in
> > a bottleneck on the synchronize_rcu() invoked from kern_unmount().
> > This patch applies the synchronize_rcu_expedited() hammer to allow
> > further testing and fault isolation.
> > 
> > Hey, at least there was no need to change the comment!  ;-)
> > 
> 
> I don’t think this will be fast enough.  I think the problem is that commit e1eb26fa62d04ec0955432be1aa8722a97cb52e7 is putting all of the ipc namespace frees onto a list, and every free includes one call to synchronize_rcu()
> 
> The end result is that we can create new namespaces much much faster than we can free them, and eventually we run out.  I found this while debugging clone() returning ENOSPC because create_ipc_ns() was returning ENOSPC.

Moving from synchronize_rcu() to synchronize_rcu_expedited() does buy
you at least an order of magnitude.  But yes, it should be possible to
get rid of all but one call per batch, which would be better.  Maybe
a bit more complicated, but probably not that much.

Let me see what I can come up with.

If this is an emergency, I still suggest trying the patch as a short-term
workaround.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-02-14 19:44   ` Paul E. McKenney
@ 2022-02-14 20:55     ` Rik van Riel
  2022-02-14 21:41       ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Rik van Riel @ 2022-02-14 20:55 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Chris Mason, Giuseppe Scrivano, viro, linux-kernel,
	linux-fsdevel, Kernel Team

On Mon, 14 Feb 2022 11:44:40 -0800
"Paul E. McKenney" <paulmck@kernel.org> wrote:
> On Mon, Feb 14, 2022 at 07:26:49PM +0000, Chris Mason wrote:

> Moving from synchronize_rcu() to synchronize_rcu_expedited() does buy
> you at least an order of magnitude.  But yes, it should be possible to
> get rid of all but one call per batch, which would be better.  Maybe
> a bit more complicated, but probably not that much.

It doesn't look too bad, except for the include of ../fs/mount.h.

I'm hoping somebody has a better idea on how to deal with that.
Do we need a kern_unmount() variant that doesn't do the RCU wait,
or should it get a parameter, or something else?

Is there an ordering requirement between the synchronize_rcu call
and zeroing out n->mq_mnt->mnt_ls?

What other changes do we need to make everything right?

The change below also fixes the issue that to-be-freed items that
are queued up while the free_ipc work function runs do not result
in the work item being enqueued again.

This patch is still totally untested because the 4 year old is
at home today :)


diff --git a/ipc/namespace.c b/ipc/namespace.c
index 7bd0766ddc3b..321cbda17cfb 100644
--- a/ipc/namespace.c
+++ b/ipc/namespace.c
@@ -17,6 +17,7 @@
 #include <linux/proc_ns.h>
 #include <linux/sched/task.h>
 
+#include "../fs/mount.h"
 #include "util.h"
 
 static struct ucounts *inc_ipc_namespaces(struct user_namespace *ns)
@@ -117,10 +118,7 @@ void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
 
 static void free_ipc_ns(struct ipc_namespace *ns)
 {
-	/* mq_put_mnt() waits for a grace period as kern_unmount()
-	 * uses synchronize_rcu().
-	 */
-	mq_put_mnt(ns);
+	mntput(ns->mq_mnt);
 	sem_exit_ns(ns);
 	msg_exit_ns(ns);
 	shm_exit_ns(ns);
@@ -134,11 +132,19 @@ static void free_ipc_ns(struct ipc_namespace *ns)
 static LLIST_HEAD(free_ipc_list);
 static void free_ipc(struct work_struct *unused)
 {
-	struct llist_node *node = llist_del_all(&free_ipc_list);
+	struct llist_node *node;
 	struct ipc_namespace *n, *t;
 
-	llist_for_each_entry_safe(n, t, node, mnt_llist)
-		free_ipc_ns(n);
+	while ((node = llist_del_all(&free_ipc_list))) {
+		llist_for_each_entry(n, node, mnt_llist)
+			real_mount(n->mq_mnt)->mnt_ns = NULL;
+
+		/* Wait for the last users to have gone away. */
+		synchronize_rcu();
+
+		llist_for_each_entry_safe(n, t, node, mnt_llist)
+			free_ipc_ns(n);
+	}
 }
 
 /*


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-02-14 20:55     ` Rik van Riel
@ 2022-02-14 21:41       ` Paul E. McKenney
  0 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2022-02-14 21:41 UTC (permalink / raw)
  To: Rik van Riel
  Cc: Chris Mason, Giuseppe Scrivano, viro, linux-kernel,
	linux-fsdevel, Kernel Team

On Mon, Feb 14, 2022 at 03:55:36PM -0500, Rik van Riel wrote:
> On Mon, 14 Feb 2022 11:44:40 -0800
> "Paul E. McKenney" <paulmck@kernel.org> wrote:
> > On Mon, Feb 14, 2022 at 07:26:49PM +0000, Chris Mason wrote:
> 
> > Moving from synchronize_rcu() to synchronize_rcu_expedited() does buy
> > you at least an order of magnitude.  But yes, it should be possible to
> > get rid of all but one call per batch, which would be better.  Maybe
> > a bit more complicated, but probably not that much.
> 
> It doesn't look too bad, except for the include of ../fs/mount.h.
> 
> I'm hoping somebody has a better idea on how to deal with that.
> Do we need a kern_unmount() variant that doesn't do the RCU wait,
> or should it get a parameter, or something else?
> 
> Is there an ordering requirement between the synchronize_rcu call
> and zeroing out n->mq_mnt->mnt_ls?
> 
> What other changes do we need to make everything right?
> 
> The change below also fixes the issue that to-be-freed items that
> are queued up while the free_ipc work function runs do not result
> in the work item being enqueued again.
> 
> This patch is still totally untested because the 4 year old is
> at home today :)

I agree that this should decrease grace-period latency quite a bit
more than does my patch, so here is hoping that it does work.  ;-)

							Thanx, Paul

> diff --git a/ipc/namespace.c b/ipc/namespace.c
> index 7bd0766ddc3b..321cbda17cfb 100644
> --- a/ipc/namespace.c
> +++ b/ipc/namespace.c
> @@ -17,6 +17,7 @@
>  #include <linux/proc_ns.h>
>  #include <linux/sched/task.h>
>  
> +#include "../fs/mount.h"
>  #include "util.h"
>  
>  static struct ucounts *inc_ipc_namespaces(struct user_namespace *ns)
> @@ -117,10 +118,7 @@ void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
>  
>  static void free_ipc_ns(struct ipc_namespace *ns)
>  {
> -	/* mq_put_mnt() waits for a grace period as kern_unmount()
> -	 * uses synchronize_rcu().
> -	 */
> -	mq_put_mnt(ns);
> +	mntput(ns->mq_mnt);
>  	sem_exit_ns(ns);
>  	msg_exit_ns(ns);
>  	shm_exit_ns(ns);
> @@ -134,11 +132,19 @@ static void free_ipc_ns(struct ipc_namespace *ns)
>  static LLIST_HEAD(free_ipc_list);
>  static void free_ipc(struct work_struct *unused)
>  {
> -	struct llist_node *node = llist_del_all(&free_ipc_list);
> +	struct llist_node *node;
>  	struct ipc_namespace *n, *t;
>  
> -	llist_for_each_entry_safe(n, t, node, mnt_llist)
> -		free_ipc_ns(n);
> +	while ((node = llist_del_all(&free_ipc_list))) {
> +		llist_for_each_entry(n, node, mnt_llist)
> +			real_mount(n->mq_mnt)->mnt_ns = NULL;
> +
> +		/* Wait for the last users to have gone away. */
> +		synchronize_rcu();
> +
> +		llist_for_each_entry_safe(n, t, node, mnt_llist)
> +			free_ipc_ns(n);
> +	}
>  }
>  
>  /*
> 

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-02-14 19:26 ` Chris Mason
  2022-02-14 19:44   ` Paul E. McKenney
@ 2022-02-15 18:28   ` Chris Mason
  2022-04-26  6:59     ` Christoph Bartoschek
  1 sibling, 1 reply; 12+ messages in thread
From: Chris Mason @ 2022-02-15 18:28 UTC (permalink / raw)
  To: Paul E. McKenney, Giuseppe Scrivano
  Cc: riel, viro, linux-kernel, linux-fsdevel, Kernel Team


> On Feb 14, 2022, at 2:26 PM, Chris Mason <clm@fb.com> wrote:
> 
> 
> 
>> On Feb 14, 2022, at 2:05 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
>> 
>> Experimental.  Not for inclusion.  Yet, anyway.
>> 
>> Freeing large numbers of namespaces in quick succession can result in
>> a bottleneck on the synchronize_rcu() invoked from kern_unmount().
>> This patch applies the synchronize_rcu_expedited() hammer to allow
>> further testing and fault isolation.
>> 
>> Hey, at least there was no need to change the comment!  ;-)
>> 
> 
> I don’t think this will be fast enough.  I think the problem is that commit e1eb26fa62d04ec0955432be1aa8722a97cb52e7 is putting all of the ipc namespace frees onto a list, and every free includes one call to synchronize_rcu()
> 
> The end result is that we can create new namespaces much much faster than we can free them, and eventually we run out.  I found this while debugging clone() returning ENOSPC because create_ipc_ns() was returning ENOSPC.

I’m going to try Rik’s patch, but I changed Giuseppe’s benchmark from this commit, just to make it run for a million iterations instead of 1000.

#define _GNU_SOURCE
#include <sched.h>
#include <error.h>
#include <errno.h>
#include <stdlib.h>
#include <stdio.h>

int main()
{
        int i;

        for (i = 0; i < 1000000; i++) {
                if (unshare(CLONE_NEWIPC) < 0)
                        error(EXIT_FAILURE, errno, "unshare");
        }
}

Then I put on a drgn script to print the size of the free_ipc_list:

#!/usr/bin/env drgn
# usage: ./check_list <pid of worker thread doing free_ipc calls>

from drgn import *
from drgn.helpers.linux.pid import find_task
import sys,os,time

def llist_count(cur):
    count = 0
    while cur:
        count += 1
        cur = cur.next
    return count

pid = int(sys.argv[1])

# sometimes the worker is in different functions, so this
# will throw exceptions if we can't find the free_ipc call
for x in range(1, 5):
    try:
        task = find_task(prog, int(pid))
        trace = prog.stack_trace(task)
        head = prog['free_ipc_list']
        for i in range(0, len(trace)):
            if "free_ipc at" in str(trace[i]):
                free_ipc_index = i
        n = trace[free_ipc_index]['n']
        print("ipc free list is %d worker %d remaining %d" % (llist_count(head.first), pid, llist_count(n.mnt_llist.next)))
        break
    except:
        time.sleep(0.5)
        pass

I was expecting the run to pretty quickly hit ENOSPC, then try Rik’s patch, then celebrate and move on.  What seems to be happening instead is that unshare is spending all of its time creating super blocks:

    48.07%  boom             [kernel.vmlinux]                                                 [k] test_keyed_super
            |
            ---0x5541f689495641d7
               __libc_start_main
               unshare
               entry_SYSCALL_64
               do_syscall_64
               __x64_sys_unshare
               ksys_unshare
               unshare_nsproxy_namespaces
               create_new_namespaces
               copy_ipcs
               mq_init_ns
               mq_create_mount
               fc_mount
               vfs_get_tree
               vfs_get_super
               sget_fc
               test_keyed_super

But, this does nicely show the backlog on the free_ipc_list.  It gets up to around 150K entries, with our worker thread stuck:

196 kworker/0:2+events D
[<0>] __wait_rcu_gp+0x105/0x120
[<0>] synchronize_rcu+0x64/0x70
[<0>] kern_unmount+0x27/0x50
[<0>] free_ipc+0x6b/0xe0
[<0>] process_one_work+0x1ee/0x3c0
[<0>] worker_thread+0x23a/0x3b0
[<0>] kthread+0xe6/0x110
[<0>] ret_from_fork+0x1f/0x30

# ./check_list.drgn 196
ipc free list is 58099 worker 196 remaining 98012

Eventually, hlist_for_each_entry(old, &fc->fs_type->fs_supers, s_instances) is slower than synchronize_rcu(), and the worker thread is able to make progress?  Production in this case is a few nsjail procs, so it’s not a crazy workload.  My guess is that prod tends to have longer grace periods than this test box, so the worker thread loses, but I haven’t been able to figure out why the worker suddenly catches up from time to time on the test box.

-chris


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-02-15 18:28   ` Chris Mason
@ 2022-04-26  6:59     ` Christoph Bartoschek
  2022-04-26 14:09       ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Bartoschek @ 2022-04-26  6:59 UTC (permalink / raw)
  To: Chris Mason, Paul E. McKenney, Giuseppe Scrivano
  Cc: linux-kernel, riel, viro, Christoph Bartoschek

The regression that has been introduced with commit
e1eb26fa62d04ec0955432be1aa8722a97cb52e7 has hit us when building with Bazel
using the linux-sandbox
(https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc).
The sandbox tries to isolate build steps from each other and to ensure that
builds are hermetic and therefore sets up new namespaces for each step. For
large software packages and even with the time spend building we run out of
namespaces on larger machines that allow for enough parallelism. I have reduced
the sandbox to a simple test case:

#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

int pid1main(void *) {
   return 0;
}

int main(void) {
  int clone_flags = CLONE_NEWUSER | CLONE_NEWIPC | SIGCHLD;
  void * stack = malloc(1024*1024);
  const pid_t child_pid = clone(pid1main, stack + 1024*1024, clone_flags, NULL);

  if (child_pid < 0) {
    perror("clone");
  }
  int ret = waitpid(child_pid, NULL, 0);
  if (ret < 0) {
    perror("waitpid");
    return ret;
  }
  return 0;
}

Run it with
$ gcc clone-test.cc
$ seq 1 10000000 | parallel --halt now,fail=1 -j32 $PWD/a.out
clone: No space left on device
waitpid: No child processes
parallel: This job failed:
/usr/local/google/home/bartoschek/linux-sandbox-test/a.out 53070

I run the test on kernel v5.18-rc4.
Depending on your configured limits you will soon get an ENOSPC even though
never more than 32 additional namespaces should be in use by parallel.
During execution the whole system can become quite unresponsive.
This does not happen without e1eb26fa62d04ec0955432be1aa8722a97cb52e7.

I see that the issue was already reported in 2020:
http://merlin.infradead.org/pipermail/linux-nvme/2020-September/019565.html

Would it be possible to revert e1eb26fa62d04ec0955432be1aa8722a97cb52e7? It
seems to make the kernel less deterministic and hard to reason about active
namespaces.

Christoph

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-04-26  6:59     ` Christoph Bartoschek
@ 2022-04-26 14:09       ` Paul E. McKenney
  2022-04-26 22:58         ` Christoph Bartoschek
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2022-04-26 14:09 UTC (permalink / raw)
  To: Christoph Bartoschek
  Cc: Chris Mason, Giuseppe Scrivano, linux-kernel, riel, viro

On Tue, Apr 26, 2022 at 08:59:17AM +0200, Christoph Bartoschek wrote:
> The regression that has been introduced with commit
> e1eb26fa62d04ec0955432be1aa8722a97cb52e7 has hit us when building with Bazel
> using the linux-sandbox
> (https://github.com/bazelbuild/bazel/blob/master/src/main/tools/linux-sandbox.cc).
> The sandbox tries to isolate build steps from each other and to ensure that
> builds are hermetic and therefore sets up new namespaces for each step. For
> large software packages and even with the time spend building we run out of
> namespaces on larger machines that allow for enough parallelism. I have reduced
> the sandbox to a simple test case:
> 
> #define _GNU_SOURCE
> #include <errno.h>
> #include <sched.h>
> #include <stdio.h>
> #include <stdlib.h>
> #include <sys/types.h>
> #include <sys/wait.h>
> 
> int pid1main(void *) {
>    return 0;
> }
> 
> int main(void) {
>   int clone_flags = CLONE_NEWUSER | CLONE_NEWIPC | SIGCHLD;
>   void * stack = malloc(1024*1024);
>   const pid_t child_pid = clone(pid1main, stack + 1024*1024, clone_flags, NULL);
> 
>   if (child_pid < 0) {
>     perror("clone");
>   }
>   int ret = waitpid(child_pid, NULL, 0);
>   if (ret < 0) {
>     perror("waitpid");
>     return ret;
>   }
>   return 0;
> }
> 
> Run it with
> $ gcc clone-test.cc
> $ seq 1 10000000 | parallel --halt now,fail=1 -j32 $PWD/a.out
> clone: No space left on device
> waitpid: No child processes
> parallel: This job failed:
> /usr/local/google/home/bartoschek/linux-sandbox-test/a.out 53070
> 
> I run the test on kernel v5.18-rc4.
> Depending on your configured limits you will soon get an ENOSPC even though
> never more than 32 additional namespaces should be in use by parallel.
> During execution the whole system can become quite unresponsive.
> This does not happen without e1eb26fa62d04ec0955432be1aa8722a97cb52e7.
> 
> I see that the issue was already reported in 2020:
> http://merlin.infradead.org/pipermail/linux-nvme/2020-September/019565.html
> 
> Would it be possible to revert e1eb26fa62d04ec0955432be1aa8722a97cb52e7? It
> seems to make the kernel less deterministic and hard to reason about active
> namespaces.

There were several attempts to fix this:

1. https://lore.kernel.org/lkml/20220214190549.GA2815154@paulmck-ThinkPad-P17-Gen-1/
	Replace a synchronize_rcu() with synchronize_rcu_expedited()

2. https://lore.kernel.org/lkml/20220217153620.4607bc28@imladris.surriel.com/
	Use queue_rcu_work() and streamline things.

3. https://lore.kernel.org/lkml/20220218183114.2867528-1-riel@surriel.com/
	Refined queue_rcu_work() approach.

#1 should work, but the resulting IPIs are not going to make the real-time
guys happy.  #2 and #3 have been subject to reasonably heavy testing
and did fix a very similar issue to the one that you are reporting,
but last I knew there were doubts about the concurrency consequences.

Could you please give at least #3 a shot and see if it helps you?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-04-26 14:09       ` Paul E. McKenney
@ 2022-04-26 22:58         ` Christoph Bartoschek
  2022-04-26 23:11           ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Christoph Bartoschek @ 2022-04-26 22:58 UTC (permalink / raw)
  To: paulmck; +Cc: Chris Mason, Giuseppe Scrivano, linux-kernel, riel, viro

> 3. https://lore.kernel.org/lkml/20220218183114.2867528-1-riel@surriel.com/
>         Refined queue_rcu_work() approach.
>
> #1 should work, but the resulting IPIs are not going to make the real-time
> guys happy.  #2 and #3 have been subject to reasonably heavy testing
> and did fix a very similar issue to the one that you are reporting,
> but last I knew there were doubts about the concurrency consequences.
>
> Could you please give at least #3 a shot and see if it helps you?
>

I have tried #3 and it works well with my testcases as far as I can see it.

Christoph

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-04-26 22:58         ` Christoph Bartoschek
@ 2022-04-26 23:11           ` Paul E. McKenney
  2022-04-27 12:19             ` Chris Mason
  0 siblings, 1 reply; 12+ messages in thread
From: Paul E. McKenney @ 2022-04-26 23:11 UTC (permalink / raw)
  To: Christoph Bartoschek
  Cc: Chris Mason, Giuseppe Scrivano, linux-kernel, riel, viro

On Wed, Apr 27, 2022 at 12:58:34AM +0200, Christoph Bartoschek wrote:
> > 3. https://lore.kernel.org/lkml/20220218183114.2867528-1-riel@surriel.com/
> >         Refined queue_rcu_work() approach.
> >
> > #1 should work, but the resulting IPIs are not going to make the real-time
> > guys happy.  #2 and #3 have been subject to reasonably heavy testing
> > and did fix a very similar issue to the one that you are reporting,
> > but last I knew there were doubts about the concurrency consequences.
> >
> > Could you please give at least #3 a shot and see if it helps you?
> 
> I have tried #3 and it works well with my testcases as far as I can see it.

Thank you for giving it a try!

Al, are further adjustments needed to make this patch cover all the
corner cases?

							Thanx, Paul

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-04-26 23:11           ` Paul E. McKenney
@ 2022-04-27 12:19             ` Chris Mason
  2022-04-27 15:01               ` Paul E. McKenney
  0 siblings, 1 reply; 12+ messages in thread
From: Chris Mason @ 2022-04-27 12:19 UTC (permalink / raw)
  To: Paul E. McKenney
  Cc: Christoph Bartoschek, Giuseppe Scrivano, linux-kernel, riel, viro


> On Apr 26, 2022, at 7:11 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> 
> On Wed, Apr 27, 2022 at 12:58:34AM +0200, Christoph Bartoschek wrote:
>>> 3. https://lore.kernel.org/lkml/20220218183114.2867528-1-riel@surriel.com/
>>> Refined queue_rcu_work() approach.
>>> 
>>> #1 should work, but the resulting IPIs are not going to make the real-time
>>> guys happy. #2 and #3 have been subject to reasonably heavy testing
>>> and did fix a very similar issue to the one that you are reporting,
>>> but last I knew there were doubts about the concurrency consequences.
>>> 
>>> Could you please give at least #3 a shot and see if it helps you?
>> 
>> I have tried #3 and it works well with my testcases as far as I can see it.
> 
> Thank you for giving it a try!
> 
> Al, are further adjustments needed to make this patch cover all the
> corner cases?


Did we end up addressing all of Al’s comments here?

https://lore.kernel.org/lkml/YhCFKyVMtOSyBDJh@zeniv-ca.linux.org.uk/

-chris


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited()
  2022-04-27 12:19             ` Chris Mason
@ 2022-04-27 15:01               ` Paul E. McKenney
  0 siblings, 0 replies; 12+ messages in thread
From: Paul E. McKenney @ 2022-04-27 15:01 UTC (permalink / raw)
  To: Chris Mason
  Cc: Christoph Bartoschek, Giuseppe Scrivano, linux-kernel, riel, viro

On Wed, Apr 27, 2022 at 12:19:56PM +0000, Chris Mason wrote:
> 
> > On Apr 26, 2022, at 7:11 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > 
> > On Wed, Apr 27, 2022 at 12:58:34AM +0200, Christoph Bartoschek wrote:
> >>> 3. https://lore.kernel.org/lkml/20220218183114.2867528-1-riel@surriel.com/
> >>> Refined queue_rcu_work() approach.
> >>> 
> >>> #1 should work, but the resulting IPIs are not going to make the real-time
> >>> guys happy. #2 and #3 have been subject to reasonably heavy testing
> >>> and did fix a very similar issue to the one that you are reporting,
> >>> but last I knew there were doubts about the concurrency consequences.
> >>> 
> >>> Could you please give at least #3 a shot and see if it helps you?
> >> 
> >> I have tried #3 and it works well with my testcases as far as I can see it.
> > 
> > Thank you for giving it a try!
> > 
> > Al, are further adjustments needed to make this patch cover all the
> > corner cases?
> 
> Did we end up addressing all of Al’s comments here?
> 
> https://lore.kernel.org/lkml/YhCFKyVMtOSyBDJh@zeniv-ca.linux.org.uk/

Indeed we have not!  Thank you for the reminder, and I will get with
Rik to look into this.

							Thanx, Paul

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-04-27 15:01 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2022-02-14 19:05 [PATCH RFC fs/namespace] Make kern_unmount() use synchronize_rcu_expedited() Paul E. McKenney
2022-02-14 19:26 ` Chris Mason
2022-02-14 19:44   ` Paul E. McKenney
2022-02-14 20:55     ` Rik van Riel
2022-02-14 21:41       ` Paul E. McKenney
2022-02-15 18:28   ` Chris Mason
2022-04-26  6:59     ` Christoph Bartoschek
2022-04-26 14:09       ` Paul E. McKenney
2022-04-26 22:58         ` Christoph Bartoschek
2022-04-26 23:11           ` Paul E. McKenney
2022-04-27 12:19             ` Chris Mason
2022-04-27 15:01               ` Paul E. McKenney

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).