All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Paul E. McKenney" <paulmck@kernel.org>
To: Rik van Riel <riel@surriel.com>
Cc: linux-kernel@vger.kernel.org, kernel-team@fb.com,
	Chris Mason <clm@fb.com>,
	linux-fsdevel@vger.kernel.org,
	Giuseppe Scrivano <gscrivan@redhat.com>,
	Al Viro <viro@zeniv.linux.org.uk>
Subject: Re: [PATCH][RFC] ipc,fs: use rcu_work to free struct ipc_namespace
Date: Thu, 17 Feb 2022 13:55:31 -0800	[thread overview]
Message-ID: <20220217215531.GT4285@paulmck-ThinkPad-P17-Gen-1> (raw)
In-Reply-To: <20220217153620.4607bc28@imladris.surriel.com>

On Thu, Feb 17, 2022 at 03:36:20PM -0500, Rik van Riel wrote:
> The patch works, but a cleanup question for Al Viro:
> 
> How do we get rid of #include "../fs/mount.h" and the raw ->mnt_ns = NULL thing
> in the cleanest way?
> 
> ---8<---
> Currently freeing ipc_namespace structures is done through a
> workqueue, with every single item on the queue waiting in
> synchronize_rcu before it is freed, limiting the rate at which
> ipc_namespace structures can be freed to something on the order
> of 100 a second.
> 
> Getting rid of that workqueue and just using rcu_work instead
> allows a whole batch of ipc_namespace frees to wait one single
> RCU grace period, after which they can all get freed quickly.
> 
> Without this patch, a test program that simply calls
> unshare(CLONE_NEWIPC) a million times in a loop eventually
> gets -ENOSPC as the total number of ipc_namespace structures
> exceeds the limit, due to slow freeing.
> 
> With this patch, the test program runs successfully every time.
> 
> Reported-by: Chris Mason <clm@fb.com>
> Signed-off-by: Rik van Riel <riel@surriel.com>

From an RCU perspective, with the ever-dangerous presumption that the
prior code was functionally correct:

Reviewed-by: Paul E. McKenney <paulmck@kernel.org>

> ---
>  include/linux/ipc_namespace.h |  2 +-
>  ipc/namespace.c               | 30 ++++++++----------------------
>  2 files changed, 9 insertions(+), 23 deletions(-)
> 
> diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
> index b75395ec8d52..ee26fdbb2ce4 100644
> --- a/include/linux/ipc_namespace.h
> +++ b/include/linux/ipc_namespace.h
> @@ -67,7 +67,7 @@ struct ipc_namespace {
>  	struct user_namespace *user_ns;
>  	struct ucounts *ucounts;
>  
> -	struct llist_node mnt_llist;
> +	struct rcu_work free_rwork;
>  
>  	struct ns_common ns;
>  } __randomize_layout;
> diff --git a/ipc/namespace.c b/ipc/namespace.c
> index ae83f0f2651b..3d151bc5f723 100644
> --- a/ipc/namespace.c
> +++ b/ipc/namespace.c
> @@ -17,6 +17,7 @@
>  #include <linux/proc_ns.h>
>  #include <linux/sched/task.h>
>  
> +#include "../fs/mount.h"
>  #include "util.h"
>  
>  static struct ucounts *inc_ipc_namespaces(struct user_namespace *ns)
> @@ -115,12 +116,11 @@ void free_ipcs(struct ipc_namespace *ns, struct ipc_ids *ids,
>  	up_write(&ids->rwsem);
>  }
>  
> -static void free_ipc_ns(struct ipc_namespace *ns)
> +static void free_ipc_ns(struct work_struct *work)
>  {
> -	/* mq_put_mnt() waits for a grace period as kern_unmount()
> -	 * uses synchronize_rcu().
> -	 */
> -	mq_put_mnt(ns);
> +	struct ipc_namespace *ns = container_of(to_rcu_work(work),
> +				   struct ipc_namespace, free_rwork);
> +	mntput(ns->mq_mnt);
>  	sem_exit_ns(ns);
>  	msg_exit_ns(ns);
>  	shm_exit_ns(ns);
> @@ -131,21 +131,6 @@ static void free_ipc_ns(struct ipc_namespace *ns)
>  	kfree(ns);
>  }
>  
> -static LLIST_HEAD(free_ipc_list);
> -static void free_ipc(struct work_struct *unused)
> -{
> -	struct llist_node *node = llist_del_all(&free_ipc_list);
> -	struct ipc_namespace *n, *t;
> -
> -	llist_for_each_entry_safe(n, t, node, mnt_llist)
> -		free_ipc_ns(n);
> -}
> -
> -/*
> - * The work queue is used to avoid the cost of synchronize_rcu in kern_unmount.
> - */
> -static DECLARE_WORK(free_ipc_work, free_ipc);
> -
>  /*
>   * put_ipc_ns - drop a reference to an ipc namespace.
>   * @ns: the namespace to put
> @@ -166,10 +151,11 @@ void put_ipc_ns(struct ipc_namespace *ns)
>  {
>  	if (refcount_dec_and_lock(&ns->ns.count, &mq_lock)) {
>  		mq_clear_sbinfo(ns);
> +		real_mount(ns->mq_mnt)->mnt_ns = NULL;
>  		spin_unlock(&mq_lock);
>  
> -		if (llist_add(&ns->mnt_llist, &free_ipc_list))
> -			schedule_work(&free_ipc_work);
> +		INIT_RCU_WORK(&ns->free_rwork, free_ipc_ns);
> +		queue_rcu_work(system_wq, &ns->free_rwork);
>  	}
>  }
>  
> -- 
> 2.34.1
> 
> 

  reply	other threads:[~2022-02-17 21:55 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-02-17 20:36 [PATCH][RFC] ipc,fs: use rcu_work to free struct ipc_namespace Rik van Riel
2022-02-17 21:55 ` Paul E. McKenney [this message]
2022-02-18  3:29 ` Al Viro
2022-02-18  5:31   ` Paul E. McKenney
2022-02-18 13:01 ` Christian Brauner
2022-02-18 16:08 ` Eric W. Biederman
2022-02-18 16:41   ` Paul E. McKenney
2022-02-18 16:54   ` Rik van Riel

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220217215531.GT4285@paulmck-ThinkPad-P17-Gen-1 \
    --to=paulmck@kernel.org \
    --cc=clm@fb.com \
    --cc=gscrivan@redhat.com \
    --cc=kernel-team@fb.com \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=riel@surriel.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.