linux-nfs.vger.kernel.org archive mirror
* [PATCH AUTOSEL 5.4 03/35] nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl
       [not found] <20200507142830.26239-1-sashal@kernel.org>
@ 2020-05-07 14:27 ` Sasha Levin
  2020-05-07 14:28 ` [PATCH AUTOSEL 5.4 26/35] SUNRPC: defer slow parts of rpc_free_client() to a workqueue Sasha Levin
  1 sibling, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2020-05-07 14:27 UTC
  To: linux-kernel, stable
  Cc: Andreas Gruenbacher, Xiyu Yang, Trond Myklebust, Sasha Levin, linux-nfs

From: Andreas Gruenbacher <agruenba@redhat.com>

[ Upstream commit 7648f939cb919b9d15c21fff8cd9eba908d595dc ]

nfs3_set_acl keeps track of the acl it allocated locally to determine if an acl
needs to be released at the end.  This results in a memory leak when the
function allocates an acl as well as a default acl.  Fix by releasing acls
that differ from the acl originally passed into nfs3_set_acl.

Fixes: b7fa0554cf1b ("[PATCH] NFS: Add support for NFSv3 ACLs")
Reported-by: Xiyu Yang <xiyuyang19@fudan.edu.cn>
Signed-off-by: Andreas Gruenbacher <agruenba@redhat.com>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 fs/nfs/nfs3acl.c | 22 +++++++++++++++-------
 1 file changed, 15 insertions(+), 7 deletions(-)

diff --git a/fs/nfs/nfs3acl.c b/fs/nfs/nfs3acl.c
index c5c3fc6e6c600..26c94b32d6f49 100644
--- a/fs/nfs/nfs3acl.c
+++ b/fs/nfs/nfs3acl.c
@@ -253,37 +253,45 @@ int nfs3_proc_setacls(struct inode *inode, struct posix_acl *acl,
 
 int nfs3_set_acl(struct inode *inode, struct posix_acl *acl, int type)
 {
-	struct posix_acl *alloc = NULL, *dfacl = NULL;
+	struct posix_acl *orig = acl, *dfacl = NULL, *alloc;
 	int status;
 
 	if (S_ISDIR(inode->i_mode)) {
 		switch(type) {
 		case ACL_TYPE_ACCESS:
-			alloc = dfacl = get_acl(inode, ACL_TYPE_DEFAULT);
+			alloc = get_acl(inode, ACL_TYPE_DEFAULT);
 			if (IS_ERR(alloc))
 				goto fail;
+			dfacl = alloc;
 			break;
 
 		case ACL_TYPE_DEFAULT:
-			dfacl = acl;
-			alloc = acl = get_acl(inode, ACL_TYPE_ACCESS);
+			alloc = get_acl(inode, ACL_TYPE_ACCESS);
 			if (IS_ERR(alloc))
 				goto fail;
+			dfacl = acl;
+			acl = alloc;
 			break;
 		}
 	}
 
 	if (acl == NULL) {
-		alloc = acl = posix_acl_from_mode(inode->i_mode, GFP_KERNEL);
+		alloc = posix_acl_from_mode(inode->i_mode, GFP_KERNEL);
 		if (IS_ERR(alloc))
 			goto fail;
+		acl = alloc;
 	}
 	status = __nfs3_proc_setacls(inode, acl, dfacl);
-	posix_acl_release(alloc);
+out:
+	if (acl != orig)
+		posix_acl_release(acl);
+	if (dfacl != orig)
+		posix_acl_release(dfacl);
 	return status;
 
 fail:
-	return PTR_ERR(alloc);
+	status = PTR_ERR(alloc);
+	goto out;
 }
 
 const struct xattr_handler *nfs3_xattr_handlers[] = {
-- 
2.20.1
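
For readers following the leak described in the commit message, here is a minimal userspace sketch of the two bookkeeping schemes. It is not the kernel code: struct acl, acl_alloc(), acl_release(), live_acls, set_acl_leaky() and set_acl_fixed() are hypothetical stand-ins, and the case shown is the one where the caller passes a NULL acl for a directory, so both a default acl and an access acl get allocated locally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct posix_acl: only a reference count. */
struct acl {
	int refcount;
};

static int live_acls;	/* objects allocated and not yet freed */

static struct acl *acl_alloc(void)
{
	struct acl *a = malloc(sizeof(*a));

	if (!a)
		return NULL;
	a->refcount = 1;
	live_acls++;
	return a;
}

/* Modelled on posix_acl_release(): NULL is allowed, last put frees. */
static void acl_release(struct acl *a)
{
	if (!a)
		return;
	assert(a->refcount > 0);
	if (--a->refcount == 0) {
		free(a);
		live_acls--;
	}
}

/* Old scheme: a single 'alloc' tracker remembers only the last allocation. */
static void set_acl_leaky(struct acl *acl)
{
	struct acl *alloc, *dfacl;

	alloc = dfacl = acl_alloc();		/* first local allocation */
	if (acl == NULL)
		alloc = acl = acl_alloc();	/* overwrites the tracker */
	(void)dfacl;				/* acl and dfacl would be sent here */
	acl_release(alloc);			/* dfacl is never released */
}

/* New scheme: release whatever differs from the caller's pointer. */
static void set_acl_fixed(struct acl *acl)
{
	struct acl *orig = acl, *dfacl;

	dfacl = acl_alloc();
	if (acl == NULL)
		acl = acl_alloc();
	if (acl != orig)
		acl_release(acl);
	if (dfacl != orig)
		acl_release(dfacl);
}
```

Comparing against the caller's pointer means no per-allocation tracker is needed, and the caller's own reference is never dropped by mistake.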



* [PATCH AUTOSEL 5.4 26/35] SUNRPC: defer slow parts of rpc_free_client() to a workqueue.
       [not found] <20200507142830.26239-1-sashal@kernel.org>
  2020-05-07 14:27 ` [PATCH AUTOSEL 5.4 03/35] nfs: Fix potential posix_acl refcnt leak in nfs3_set_acl Sasha Levin
@ 2020-05-07 14:28 ` Sasha Levin
  2020-05-07 21:18   ` NeilBrown
  1 sibling, 1 reply; 4+ messages in thread
From: Sasha Levin @ 2020-05-07 14:28 UTC
  To: linux-kernel, stable
  Cc: NeilBrown, Trond Myklebust, Sasha Levin, linux-nfs, netdev

From: NeilBrown <neilb@suse.de>

[ Upstream commit 7c4310ff56422ea43418305d22bbc5fe19150ec4 ]

The rpciod workqueue is on the write-out path for freeing dirty memory,
so it is important that it never block waiting for memory to be
allocated - this can lead to a deadlock.

rpc_execute() - which is often called by an rpciod work item - calls
rpc_task_release_client() which can lead to rpc_free_client().

rpc_free_client() makes two calls which could potentially block waiting
for memory allocation.

rpc_clnt_debugfs_unregister() calls into debugfs and will block while
any of the debugfs files are being accessed.  In particular it can block
while any of the 'open' methods are being called and all of these use
malloc for one thing or another.  So this can deadlock if the memory
allocation waits for NFS to complete some writes via rpciod.

rpc_clnt_remove_pipedir() can take the inode_lock() and while it isn't
obvious that memory allocations can happen while the lock is held, it is
safer to assume they might and to not let rpciod call
rpc_clnt_remove_pipedir().

So this patch moves these two calls (together with the final kfree() and
rpciod_down()) into a work-item to be run from the system work-queue.
rpciod can continue its important work, and the final stages of the free
can happen whenever they happen.

I have seen this deadlock on a 4.12-based kernel where debugfs used
synchronize_srcu() when removing objects.  synchronize_srcu() requires a
workqueue and there were no free worker threads and none could be
allocated.  While debugfs no longer uses SRCU, I believe the deadlock
is still possible.

Signed-off-by: NeilBrown <neilb@suse.de>
Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
Signed-off-by: Sasha Levin <sashal@kernel.org>
---
 include/linux/sunrpc/clnt.h |  8 +++++++-
 net/sunrpc/clnt.c           | 21 +++++++++++++++++----
 2 files changed, 24 insertions(+), 5 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index abc63bd1be2b5..d99d39d45a494 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -71,7 +71,13 @@ struct rpc_clnt {
 #if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
 	struct dentry		*cl_debugfs;	/* debugfs directory */
 #endif
-	struct rpc_xprt_iter	cl_xpi;
+	/* cl_work is only needed after cl_xpi is no longer used,
+	 * and that are of similar size
+	 */
+	union {
+		struct rpc_xprt_iter	cl_xpi;
+		struct work_struct	cl_work;
+	};
 	const struct cred	*cl_cred;
 };
 
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index f7f78566be463..a7430b66c7389 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -877,6 +877,20 @@ EXPORT_SYMBOL_GPL(rpc_shutdown_client);
 /*
  * Free an RPC client
  */
+static void rpc_free_client_work(struct work_struct *work)
+{
+	struct rpc_clnt *clnt = container_of(work, struct rpc_clnt, cl_work);
+
+	/* These might block on processes that might allocate memory,
+	 * so they cannot be called in rpciod, so they are handled separately
+	 * here.
+	 */
+	rpc_clnt_debugfs_unregister(clnt);
+	rpc_clnt_remove_pipedir(clnt);
+
+	kfree(clnt);
+	rpciod_down();
+}
 static struct rpc_clnt *
 rpc_free_client(struct rpc_clnt *clnt)
 {
@@ -887,17 +901,16 @@ rpc_free_client(struct rpc_clnt *clnt)
 			rcu_dereference(clnt->cl_xprt)->servername);
 	if (clnt->cl_parent != clnt)
 		parent = clnt->cl_parent;
-	rpc_clnt_debugfs_unregister(clnt);
-	rpc_clnt_remove_pipedir(clnt);
 	rpc_unregister_client(clnt);
 	rpc_free_iostats(clnt->cl_metrics);
 	clnt->cl_metrics = NULL;
 	xprt_put(rcu_dereference_raw(clnt->cl_xprt));
 	xprt_iter_destroy(&clnt->cl_xpi);
-	rpciod_down();
 	put_cred(clnt->cl_cred);
 	rpc_free_clid(clnt);
-	kfree(clnt);
+
+	INIT_WORK(&clnt->cl_work, rpc_free_client_work);
+	schedule_work(&clnt->cl_work);
 	return parent;
 }
 
-- 
2.20.1
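
The deferral pattern the patch uses (a work item embedded in the object, overlaid on a field that is finished with, and recovered via container_of()) can be sketched in userspace as follows. This is an illustration under stated assumptions, not the SUNRPC code: struct work, queue_work_item(), run_pending(), struct client and its fields are all hypothetical, and a single-threaded list stands in for the kernel's system workqueue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Userspace analogue of the kernel's container_of(). */
#define container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Hypothetical stand-in for struct work_struct. */
struct work {
	void (*fn)(struct work *);
	struct work *next;
};

static struct work *pending_head;
static struct work **pending_tail = &pending_head;

/* Stand-in for INIT_WORK() plus schedule_work(): append to a FIFO. */
static void queue_work_item(struct work *w, void (*fn)(struct work *))
{
	w->fn = fn;
	w->next = NULL;
	*pending_tail = w;
	pending_tail = &w->next;
}

/* Stand-in for a worker thread: run everything queued so far. */
static int run_pending(void)
{
	int ran = 0;

	while (pending_head) {
		struct work *w = pending_head;

		/* Unlink first: fn() may free the object that contains w. */
		pending_head = w->next;
		if (!pending_head)
			pending_tail = &pending_head;
		assert(w->fn != NULL);
		w->fn(w);
		ran++;
	}
	return ran;
}

struct client {
	int id;
	union {
		int iter_state;		/* used while the client is live */
		struct work work;	/* reuses the space during teardown */
	};
};

static int clients_freed;

/* Slow half of the teardown; runs later, outside the caller's context. */
static void client_free_work(struct work *w)
{
	struct client *c = container_of(w, struct client, work);

	free(c);
	clients_freed++;
}

/* Fast half: finish with iter_state, then hand the object to the queue. */
static void client_free(struct client *c)
{
	c->iter_state = 0;
	queue_work_item(&c->work, client_free_work);
	/* c belongs to the work item now and must not be touched here. */
}
```

In this sketch the work only runs when run_pending() is called. With a real workqueue the handler can run as soon as the item is queued, so anything the handler still dereferences has to remain valid until it has finished.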



* Re: [PATCH AUTOSEL 5.4 26/35] SUNRPC: defer slow parts of rpc_free_client() to a workqueue.
  2020-05-07 14:28 ` [PATCH AUTOSEL 5.4 26/35] SUNRPC: defer slow parts of rpc_free_client() to a workqueue Sasha Levin
@ 2020-05-07 21:18   ` NeilBrown
  2020-05-16 23:10     ` Sasha Levin
  0 siblings, 1 reply; 4+ messages in thread
From: NeilBrown @ 2020-05-07 21:18 UTC
  To: Sasha Levin, linux-kernel, stable
  Cc: Trond Myklebust, Sasha Levin, linux-nfs, netdev


On Thu, May 07 2020, Sasha Levin wrote:

> From: NeilBrown <neilb@suse.de>
>
> [ Upstream commit 7c4310ff56422ea43418305d22bbc5fe19150ec4 ]

This one is buggy - it introduces a use-after-free.  Best delay it for
now.

NeilBrown

>
> The rpciod workqueue is on the write-out path for freeing dirty memory,
> so it is important that it never block waiting for memory to be
> allocated - this can lead to a deadlock.
>
> rpc_execute() - which is often called by an rpciod work item - calls
> rpc_task_release_client() which can lead to rpc_free_client().
>
> rpc_free_client() makes two calls which could potentially block waiting
> for memory allocation.
>
> rpc_clnt_debugfs_unregister() calls into debugfs and will block while
> any of the debugfs files are being accessed.  In particular it can block
> while any of the 'open' methods are being called and all of these use
> malloc for one thing or another.  So this can deadlock if the memory
> allocation waits for NFS to complete some writes via rpciod.
>
> rpc_clnt_remove_pipedir() can take the inode_lock() and while it isn't
> obvious that memory allocations can happen while the lock is held, it is
> safer to assume they might and to not let rpciod call
> rpc_clnt_remove_pipedir().
>
> So this patch moves these two calls (together with the final kfree() and
> rpciod_down()) into a work-item to be run from the system work-queue.
> rpciod can continue its important work, and the final stages of the free
> can happen whenever they happen.
>
> I have seen this deadlock on a 4.12-based kernel where debugfs used
> synchronize_srcu() when removing objects.  synchronize_srcu() requires a
> workqueue and there were no free worker threads and none could be
> allocated.  While debugfs no longer uses SRCU, I believe the deadlock
> is still possible.
>
> Signed-off-by: NeilBrown <neilb@suse.de>
> Signed-off-by: Trond Myklebust <trond.myklebust@hammerspace.com>
> Signed-off-by: Sasha Levin <sashal@kernel.org>
> ---
>  include/linux/sunrpc/clnt.h |  8 +++++++-
>  net/sunrpc/clnt.c           | 21 +++++++++++++++++----
>  2 files changed, 24 insertions(+), 5 deletions(-)
>
> diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
> index abc63bd1be2b5..d99d39d45a494 100644
> --- a/include/linux/sunrpc/clnt.h
> +++ b/include/linux/sunrpc/clnt.h
> @@ -71,7 +71,13 @@ struct rpc_clnt {
>  #if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
>  	struct dentry		*cl_debugfs;	/* debugfs directory */
>  #endif
> -	struct rpc_xprt_iter	cl_xpi;
> +	/* cl_work is only needed after cl_xpi is no longer used,
> +	 * and that are of similar size
> +	 */
> +	union {
> +		struct rpc_xprt_iter	cl_xpi;
> +		struct work_struct	cl_work;
> +	};
>  	const struct cred	*cl_cred;
>  };
>  
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index f7f78566be463..a7430b66c7389 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -877,6 +877,20 @@ EXPORT_SYMBOL_GPL(rpc_shutdown_client);
>  /*
>   * Free an RPC client
>   */
> +static void rpc_free_client_work(struct work_struct *work)
> +{
> +	struct rpc_clnt *clnt = container_of(work, struct rpc_clnt, cl_work);
> +
> +	/* These might block on processes that might allocate memory,
> +	 * so they cannot be called in rpciod, so they are handled separately
> +	 * here.
> +	 */
> +	rpc_clnt_debugfs_unregister(clnt);
> +	rpc_clnt_remove_pipedir(clnt);
> +
> +	kfree(clnt);
> +	rpciod_down();
> +}
>  static struct rpc_clnt *
>  rpc_free_client(struct rpc_clnt *clnt)
>  {
> @@ -887,17 +901,16 @@ rpc_free_client(struct rpc_clnt *clnt)
>  			rcu_dereference(clnt->cl_xprt)->servername);
>  	if (clnt->cl_parent != clnt)
>  		parent = clnt->cl_parent;
> -	rpc_clnt_debugfs_unregister(clnt);
> -	rpc_clnt_remove_pipedir(clnt);
>  	rpc_unregister_client(clnt);
>  	rpc_free_iostats(clnt->cl_metrics);
>  	clnt->cl_metrics = NULL;
>  	xprt_put(rcu_dereference_raw(clnt->cl_xprt));
>  	xprt_iter_destroy(&clnt->cl_xpi);
> -	rpciod_down();
>  	put_cred(clnt->cl_cred);
>  	rpc_free_clid(clnt);
> -	kfree(clnt);
> +
> +	INIT_WORK(&clnt->cl_work, rpc_free_client_work);
> +	schedule_work(&clnt->cl_work);
>  	return parent;
>  }
>  
> -- 
> 2.20.1


* Re: [PATCH AUTOSEL 5.4 26/35] SUNRPC: defer slow parts of rpc_free_client() to a workqueue.
  2020-05-07 21:18   ` NeilBrown
@ 2020-05-16 23:10     ` Sasha Levin
  0 siblings, 0 replies; 4+ messages in thread
From: Sasha Levin @ 2020-05-16 23:10 UTC
  To: NeilBrown; +Cc: linux-kernel, stable, Trond Myklebust, linux-nfs, netdev

On Fri, May 08, 2020 at 07:18:53AM +1000, NeilBrown wrote:
>On Thu, May 07 2020, Sasha Levin wrote:
>
>> From: NeilBrown <neilb@suse.de>
>>
>> [ Upstream commit 7c4310ff56422ea43418305d22bbc5fe19150ec4 ]
>
>This one is buggy - it introduces a use-after-free.  Best delay it for
>now.

I've dropped it, thanks!

-- 
Thanks,
Sasha

