linux-nfs.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net 0/2] auth_gss: Fix netns refcount leaks when use-gss-proxy==1
@ 2021-09-28  3:14 Wang Hai
  2021-09-28  3:14 ` [PATCH net 1/2] net: Modify unix_stream_connect to not reference count the netns of kernel sockets Wang Hai
  2021-09-28  3:14 ` [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1 Wang Hai
  0 siblings, 2 replies; 16+ messages in thread
From: Wang Hai @ 2021-09-28  3:14 UTC (permalink / raw)
  To: bfields, davem, kuba, wenbin.zeng, jlayton, dsahern,
	nicolas.dichtel, viro, willy, jakub.kicinski, tyhicks, cong.wang,
	ast, jiang.wang, christian.brauner, edumazet, Rao.Shoaib, kuniyu,
	trond.myklebust, anna.schumaker, chuck.lever, neilb, kolga, timo,
	tom
  Cc: linux-nfs, netdev, linux-kernel

When use-gss-proxy is set to 1, write_gssp() creates a rpc client in
gssp_rpc_create(), this increases netns refcount by 2 [1], these
refcounts are supposed to be released in rpcsec_gss_exit_net(), but
it will never happen because rpcsec_gss_exit_net() is triggered only
when netns refcount gets to 0, specifically:
     refcount=0 -> cleanup_net() -> ops_exit_list -> rpcsec_gss_exit_net
It is a deadlock situation here, refcount will never get to 0 unless
rpcsec_gss_exit_net() is called. 

[1]
SyS_write
    vfs_write
        proc_reg_write
            write_gssp
                set_gssp_clnt
                    gssp_rpc_create
                        rpc_create
                            xprt_create_transport
                                xs_setup_local
                                    xs_setup_xprt
                                        xprt_alloc   // get net refcount
                                    xs_local_setup_socket
                                        unix_create
                                        kernel_connect // get net refcount

In this case, the net refcount shouldn't be increased when creating rpc
client, otherwise it will lead to deadlock.

This patchset removes the increased netns reference count.

Wang Hai (2):
  net: Modify unix_stream_connect to not reference count the netns of
    kernel sockets
  auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when
    use-gss-proxy==1

 include/linux/sunrpc/clnt.h                |  1 +
 include/linux/sunrpc/xprt.h                |  6 ++++--
 net/sunrpc/auth_gss/gss_rpc_upcall.c       |  3 ++-
 net/sunrpc/clnt.c                          |  2 ++
 net/sunrpc/xprt.c                          | 13 +++++++++----
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c |  2 +-
 net/sunrpc/xprtrdma/transport.c            |  2 +-
 net/sunrpc/xprtsock.c                      |  4 +++-
 net/unix/af_unix.c                         |  3 ++-
 9 files changed, 25 insertions(+), 11 deletions(-)

-- 
2.17.1


^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH net 1/2] net: Modify unix_stream_connect to not reference count the netns of kernel sockets
  2021-09-28  3:14 [PATCH net 0/2] auth_gss: Fix netns refcount leaks when use-gss-proxy==1 Wang Hai
@ 2021-09-28  3:14 ` Wang Hai
  2021-09-28 12:50   ` Kuniyuki Iwashima
  2021-09-28  3:14 ` [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1 Wang Hai
  1 sibling, 1 reply; 16+ messages in thread
From: Wang Hai @ 2021-09-28  3:14 UTC (permalink / raw)
  To: bfields, davem, kuba, wenbin.zeng, jlayton, dsahern,
	nicolas.dichtel, viro, willy, jakub.kicinski, tyhicks, cong.wang,
	ast, jiang.wang, christian.brauner, edumazet, Rao.Shoaib, kuniyu,
	trond.myklebust, anna.schumaker, chuck.lever, neilb, kolga, timo,
	tom
  Cc: linux-nfs, netdev, linux-kernel

When use-gss-proxy is set to 1, write_gssp() creates a rpc client in
gssp_rpc_create(), this increases netns refcount by 2, these refcounts
are supposed to be released in rpcsec_gss_exit_net(), but it will never
happen because rpcsec_gss_exit_net() is triggered only when netns
refcount gets to 0, specifically:
    refcount=0 -> cleanup_net() -> ops_exit_list -> rpcsec_gss_exit_net
It is a deadlock situation here, refcount will never get to 0 unless
rpcsec_gss_exit_net() is called. So, in this case, the netns refcount
should not be increased.

In this case, kernel_connect()->unix_stream_connect() will take a netns
refcount. According to commit 26abe14379f8 ("net: Modify sk_alloc to not
reference count the netns of kernel sockets."), kernel sockets should not
take the netns refcount, so unix_stream_connect() should not take
the netns refcount when the sock is a kernel socket either.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
---
 net/unix/af_unix.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 92345c9bb60c..af6ba67779c8 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1317,7 +1317,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
 	err = -ENOMEM;
 
 	/* create new sock for complete connection */
-	newsk = unix_create1(sock_net(sk), NULL, 0, sock->type);
+	newsk = unix_create1(sock_net(sk), NULL,
+			     !sk->sk_net_refcnt, sock->type);
 	if (newsk == NULL)
 		goto out;
 
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28  3:14 [PATCH net 0/2] auth_gss: Fix netns refcount leaks when use-gss-proxy==1 Wang Hai
  2021-09-28  3:14 ` [PATCH net 1/2] net: Modify unix_stream_connect to not reference count the netns of kernel sockets Wang Hai
@ 2021-09-28  3:14 ` Wang Hai
  2021-09-28 13:30   ` Trond Myklebust
  1 sibling, 1 reply; 16+ messages in thread
From: Wang Hai @ 2021-09-28  3:14 UTC (permalink / raw)
  To: bfields, davem, kuba, wenbin.zeng, jlayton, dsahern,
	nicolas.dichtel, viro, willy, jakub.kicinski, tyhicks, cong.wang,
	ast, jiang.wang, christian.brauner, edumazet, Rao.Shoaib, kuniyu,
	trond.myklebust, anna.schumaker, chuck.lever, neilb, kolga, timo,
	tom
  Cc: linux-nfs, netdev, linux-kernel

When use-gss-proxy is set to 1, write_gssp() creates a rpc client in
gssp_rpc_create(), this increases the netns refcount by 2, these
refcounts are supposed to be released in rpcsec_gss_exit_net(), but it
will never happen because rpcsec_gss_exit_net() is triggered only when
the netns refcount gets to 0, specifically:
    refcount=0 -> cleanup_net() -> ops_exit_list -> rpcsec_gss_exit_net
It is a deadlock situation here, refcount will never get to 0 unless
rpcsec_gss_exit_net() is called. So, in this case, the netns refcount
should not be increased.

In this case, xprt will take a netns refcount which is not supposed
to be taken. Add a new flag to rpc_create_args called
RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns refcount.

It is safe not to hold the netns refcount, because when cleanup_net(), it
will hold the gssp_lock and then shut down the rpc client synchronously.

Signed-off-by: Wang Hai <wanghai38@huawei.com>
---
 include/linux/sunrpc/clnt.h                |  1 +
 include/linux/sunrpc/xprt.h                |  6 ++++--
 net/sunrpc/auth_gss/gss_rpc_upcall.c       |  3 ++-
 net/sunrpc/clnt.c                          |  2 ++
 net/sunrpc/xprt.c                          | 13 +++++++++----
 net/sunrpc/xprtrdma/svc_rdma_backchannel.c |  2 +-
 net/sunrpc/xprtrdma/transport.c            |  2 +-
 net/sunrpc/xprtsock.c                      |  4 +++-
 8 files changed, 23 insertions(+), 10 deletions(-)

diff --git a/include/linux/sunrpc/clnt.h b/include/linux/sunrpc/clnt.h
index a4661646adc9..6323518e7b1c 100644
--- a/include/linux/sunrpc/clnt.h
+++ b/include/linux/sunrpc/clnt.h
@@ -159,6 +159,7 @@ struct rpc_add_xprt_test {
 #define RPC_CLNT_CREATE_NO_RETRANS_TIMEOUT	(1UL << 9)
 #define RPC_CLNT_CREATE_SOFTERR		(1UL << 10)
 #define RPC_CLNT_CREATE_REUSEPORT	(1UL << 11)
+#define RPC_CLNT_CREATE_NO_NET_REF	(1UL << 12)
 
 struct rpc_clnt *rpc_create(struct rpc_create_args *args);
 struct rpc_clnt	*rpc_bind_new_program(struct rpc_clnt *,
diff --git a/include/linux/sunrpc/xprt.h b/include/linux/sunrpc/xprt.h
index 955ea4d7af0b..cc518a58b93c 100644
--- a/include/linux/sunrpc/xprt.h
+++ b/include/linux/sunrpc/xprt.h
@@ -284,6 +284,7 @@ struct rpc_xprt {
 	} stat;
 
 	struct net		*xprt_net;
+	bool			xprt_net_refcnt;
 	const char		*servername;
 	const char		*address_strings[RPC_DISPLAY_MAX];
 #if IS_ENABLED(CONFIG_SUNRPC_DEBUG)
@@ -317,6 +318,7 @@ static inline int bc_prealloc(struct rpc_rqst *req)
 
 #define XPRT_CREATE_INFINITE_SLOTS	(1U)
 #define XPRT_CREATE_NO_IDLE_TIMEOUT	(1U << 1)
+#define XPRT_CREATE_NO_NET_REF		(1U << 2)
 
 struct xprt_create {
 	int			ident;		/* XPRT_TRANSPORT identifier */
@@ -370,8 +372,8 @@ void			xprt_release(struct rpc_task *task);
 struct rpc_xprt *	xprt_get(struct rpc_xprt *xprt);
 void			xprt_put(struct rpc_xprt *xprt);
 struct rpc_xprt *	xprt_alloc(struct net *net, size_t size,
-				unsigned int num_prealloc,
-				unsigned int max_req);
+				   unsigned int num_prealloc,
+				   unsigned int max_req, bool xprt_net_refcnt);
 void			xprt_free(struct rpc_xprt *);
 void			xprt_add_backlog(struct rpc_xprt *xprt, struct rpc_task *task);
 bool			xprt_wake_up_backlog(struct rpc_xprt *xprt, struct rpc_rqst *req);
diff --git a/net/sunrpc/auth_gss/gss_rpc_upcall.c b/net/sunrpc/auth_gss/gss_rpc_upcall.c
index 61c276bddaf2..8c35805470d5 100644
--- a/net/sunrpc/auth_gss/gss_rpc_upcall.c
+++ b/net/sunrpc/auth_gss/gss_rpc_upcall.c
@@ -98,7 +98,8 @@ static int gssp_rpc_create(struct net *net, struct rpc_clnt **_clnt)
 		 * done without the correct namespace:
 		 */
 		.flags		= RPC_CLNT_CREATE_NOPING |
-				  RPC_CLNT_CREATE_NO_IDLE_TIMEOUT
+				  RPC_CLNT_CREATE_NO_IDLE_TIMEOUT |
+				  RPC_CLNT_CREATE_NO_NET_REF
 	};
 	struct rpc_clnt *clnt;
 	int result = 0;
diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index f056ff931444..672bebae50ca 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -543,6 +543,8 @@ struct rpc_clnt *rpc_create(struct rpc_create_args *args)
 		xprtargs.flags |= XPRT_CREATE_INFINITE_SLOTS;
 	if (args->flags & RPC_CLNT_CREATE_NO_IDLE_TIMEOUT)
 		xprtargs.flags |= XPRT_CREATE_NO_IDLE_TIMEOUT;
+	if (args->flags & RPC_CLNT_CREATE_NO_NET_REF)
+		xprtargs.flags |= XPRT_CREATE_NO_NET_REF;
 	/*
 	 * If the caller chooses not to specify a hostname, whip
 	 * up a string representation of the passed-in address.
diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index cfd681700d1a..ffa6de06f21b 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -1794,8 +1794,8 @@ static void xprt_free_id(struct rpc_xprt *xprt)
 }
 
 struct rpc_xprt *xprt_alloc(struct net *net, size_t size,
-		unsigned int num_prealloc,
-		unsigned int max_alloc)
+			    unsigned int num_prealloc,
+			    unsigned int max_alloc, bool xprt_net_refcnt)
 {
 	struct rpc_xprt *xprt;
 	struct rpc_rqst *req;
@@ -1806,6 +1806,7 @@ struct rpc_xprt *xprt_alloc(struct net *net, size_t size,
 		goto out;
 
 	xprt_alloc_id(xprt);
+	xprt->xprt_net_refcnt = xprt_net_refcnt;
 	xprt_init(xprt, net);
 
 	for (i = 0; i < num_prealloc; i++) {
@@ -1832,7 +1833,8 @@ EXPORT_SYMBOL_GPL(xprt_alloc);
 
 void xprt_free(struct rpc_xprt *xprt)
 {
-	put_net(xprt->xprt_net);
+	if (xprt->xprt_net_refcnt)
+		put_net(xprt->xprt_net);
 	xprt_free_all_slots(xprt);
 	xprt_free_id(xprt);
 	rpc_sysfs_xprt_destroy(xprt);
@@ -2024,7 +2026,10 @@ static void xprt_init(struct rpc_xprt *xprt, struct net *net)
 
 	xprt_init_xid(xprt);
 
-	xprt->xprt_net = get_net(net);
+	if (xprt->xprt_net_refcnt)
+		xprt->xprt_net = get_net(net);
+	else
+		xprt->xprt_net = net;
 }
 
 /**
diff --git a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
index 16897fcb659c..8cd7a38da0f0 100644
--- a/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
+++ b/net/sunrpc/xprtrdma/svc_rdma_backchannel.c
@@ -251,7 +251,7 @@ xprt_setup_rdma_bc(struct xprt_create *args)
 
 	xprt = xprt_alloc(args->net, sizeof(*new_xprt),
 			  RPCRDMA_MAX_BC_REQUESTS,
-			  RPCRDMA_MAX_BC_REQUESTS);
+			  RPCRDMA_MAX_BC_REQUESTS, true);
 	if (!xprt)
 		return ERR_PTR(-ENOMEM);
 
diff --git a/net/sunrpc/xprtrdma/transport.c b/net/sunrpc/xprtrdma/transport.c
index 16e5696314a4..f0f7faa571f6 100644
--- a/net/sunrpc/xprtrdma/transport.c
+++ b/net/sunrpc/xprtrdma/transport.c
@@ -323,7 +323,7 @@ xprt_setup_rdma(struct xprt_create *args)
 		return ERR_PTR(-EIO);
 
 	xprt = xprt_alloc(args->net, sizeof(struct rpcrdma_xprt), 0,
-			  xprt_rdma_slot_table_entries);
+			  xprt_rdma_slot_table_entries, true);
 	if (!xprt) {
 		module_put(THIS_MODULE);
 		return ERR_PTR(-ENOMEM);
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index 04f1b78bcbca..b6c15159992a 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -2740,6 +2740,8 @@ static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
 {
 	struct rpc_xprt *xprt;
 	struct sock_xprt *new;
+	bool xprt_net_refcnt = args->flags & XPRT_CREATE_NO_NET_REF ?
+			       false : true;
 
 	if (args->addrlen > sizeof(xprt->addr)) {
 		dprintk("RPC:       xs_setup_xprt: address too large\n");
@@ -2747,7 +2749,7 @@ static struct rpc_xprt *xs_setup_xprt(struct xprt_create *args,
 	}
 
 	xprt = xprt_alloc(args->net, sizeof(*new), slot_table_size,
-			max_slot_table_size);
+			  max_slot_table_size, xprt_net_refcnt);
 	if (xprt == NULL) {
 		dprintk("RPC:       xs_setup_xprt: couldn't allocate "
 				"rpc_xprt\n");
-- 
2.17.1


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* Re: [PATCH net 1/2] net: Modify unix_stream_connect to not reference count the netns of kernel sockets
  2021-09-28  3:14 ` [PATCH net 1/2] net: Modify unix_stream_connect to not reference count the netns of kernel sockets Wang Hai
@ 2021-09-28 12:50   ` Kuniyuki Iwashima
  0 siblings, 0 replies; 16+ messages in thread
From: Kuniyuki Iwashima @ 2021-09-28 12:50 UTC (permalink / raw)
  To: wanghai38
  Cc: Rao.Shoaib, anna.schumaker, ast, bfields, christian.brauner,
	chuck.lever, cong.wang, davem, dsahern, edumazet, jakub.kicinski,
	jiang.wang, jlayton, kolga, kuba, kuniyu, linux-kernel,
	linux-nfs, neilb, netdev, nicolas.dichtel, timo, tom,
	trond.myklebust, tyhicks, viro, wenbin.zeng, willy

From:   Wang Hai <wanghai38@huawei.com>
Date:   Tue, 28 Sep 2021 11:14:39 +0800
> When use-gss-proxy is set to 1, write_gssp() creates a rpc client in
> gssp_rpc_create(), this increases netns refcount by 2, these refcounts
> are supposed to be released in rpcsec_gss_exit_net(), but it will never
> happen because rpcsec_gss_exit_net() is triggered only when netns
> refcount gets to 0, specifically:
>     refcount=0 -> cleanup_net() -> ops_exit_list -> rpcsec_gss_exit_net
> It is a deadlock situation here, refcount will never get to 0 unless
> rpcsec_gss_exit_net() is called. So, in this case, the netns refcount
> should not be increased.
> 
> In this case, kernel_connect()->unix_stream_connect() will take a netns
> refcount. According to commit 26abe14379f8 ("net: Modify sk_alloc to not
> reference count the netns of kernel sockets."), kernel sockets should not
> take the netns refcount, so unix_stream_connect() should not take
> the netns refcount when the sock is a kernel socket either.
> 
> Signed-off-by: Wang Hai <wanghai38@huawei.com>
> ---
>  net/unix/af_unix.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
> index 92345c9bb60c..af6ba67779c8 100644
> --- a/net/unix/af_unix.c
> +++ b/net/unix/af_unix.c
> @@ -1317,7 +1317,8 @@ static int unix_stream_connect(struct socket *sock, struct sockaddr *uaddr,
>  	err = -ENOMEM;
>  
>  	/* create new sock for complete connection */
> -	newsk = unix_create1(sock_net(sk), NULL, 0, sock->type);
> +	newsk = unix_create1(sock_net(sk), NULL,
> +			     !sk->sk_net_refcnt, sock->type);

This patch conflicts with the commit f4bd73b5a950 for now.
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=f4bd73b5a950

Could you please rebase and resend the patch set?


>  	if (newsk == NULL)
>  		goto out;
>  
> -- 
> 2.17.1

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28  3:14 ` [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1 Wang Hai
@ 2021-09-28 13:30   ` Trond Myklebust
  2021-09-28 13:49     ` bfields
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2021-09-28 13:30 UTC (permalink / raw)
  To: neilb, timo, bfields, tyhicks, davem, wanghai38, nicolas.dichtel,
	edumazet, jlayton, dsahern, christian.brauner, jiang.wang,
	anna.schumaker, viro, kuba, cong.wang, ast, kuniyu, willy,
	wenbin.zeng, tom, chuck.lever, Rao.Shoaib, jakub.kicinski, kolga
  Cc: linux-nfs, netdev, linux-kernel

On Tue, 2021-09-28 at 11:14 +0800, Wang Hai wrote:
> When use-gss-proxy is set to 1, write_gssp() creates a rpc client in
> gssp_rpc_create(), this increases the netns refcount by 2, these
> refcounts are supposed to be released in rpcsec_gss_exit_net(), but
> it
> will never happen because rpcsec_gss_exit_net() is triggered only
> when
> the netns refcount gets to 0, specifically:
>     refcount=0 -> cleanup_net() -> ops_exit_list ->
> rpcsec_gss_exit_net
> It is a deadlock situation here, refcount will never get to 0 unless
> rpcsec_gss_exit_net() is called. So, in this case, the netns refcount
> should not be increased.
> 
> In this case, xprt will take a netns refcount which is not supposed
> to be taken. Add a new flag to rpc_create_args called
> RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns refcount.
> 
> It is safe not to hold the netns refcount, because when
> cleanup_net(), it
> will hold the gssp_lock and then shut down the rpc client
> synchronously.
> 
> 
I don't like this solution at all. Adding this kind of flag is going to
lead to problems down the road.

Is there any reason whatsoever why we need this RPC client to exist
when there is no active knfsd server? IOW: Is there any reason why we
shouldn't defer creating this RPC client for when knfsd starts up in
this net namespace, and why we can't shut it down when knfsd shuts
down?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28 13:30   ` Trond Myklebust
@ 2021-09-28 13:49     ` bfields
  2021-09-28 14:04       ` Trond Myklebust
  0 siblings, 1 reply; 16+ messages in thread
From: bfields @ 2021-09-28 13:49 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: neilb, timo, tyhicks, davem, wanghai38, nicolas.dichtel,
	edumazet, jlayton, dsahern, christian.brauner, jiang.wang,
	anna.schumaker, viro, kuba, cong.wang, ast, kuniyu, willy,
	wenbin.zeng, tom, chuck.lever, Rao.Shoaib, jakub.kicinski, kolga,
	linux-nfs, netdev, linux-kernel

On Tue, Sep 28, 2021 at 01:30:17PM +0000, Trond Myklebust wrote:
> On Tue, 2021-09-28 at 11:14 +0800, Wang Hai wrote:
> > When use-gss-proxy is set to 1, write_gssp() creates a rpc client in
> > gssp_rpc_create(), this increases the netns refcount by 2, these
> > refcounts are supposed to be released in rpcsec_gss_exit_net(), but
> > it
> > will never happen because rpcsec_gss_exit_net() is triggered only
> > when
> > the netns refcount gets to 0, specifically:
> >     refcount=0 -> cleanup_net() -> ops_exit_list ->
> > rpcsec_gss_exit_net
> > It is a deadlock situation here, refcount will never get to 0 unless
> > rpcsec_gss_exit_net() is called. So, in this case, the netns refcount
> > should not be increased.
> > 
> > In this case, xprt will take a netns refcount which is not supposed
> > to be taken. Add a new flag to rpc_create_args called
> > RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns refcount.
> > 
> > It is safe not to hold the netns refcount, because when
> > cleanup_net(), it
> > will hold the gssp_lock and then shut down the rpc client
> > synchronously.
> > 
> > 
> I don't like this solution at all. Adding this kind of flag is going to
> lead to problems down the road.
> 
> Is there any reason whatsoever why we need this RPC client to exist
> when there is no active knfsd server? IOW: Is there any reason why we
> shouldn't defer creating this RPC client for when knfsd starts up in
> this net namespace, and why we can't shut it down when knfsd shuts
> down?

The rpc create is done in the context of the process that writes to
/proc/net/rpc/use-gss-proxy to get the right namespaces.  I don't know
how hard it would be capture that information for a later create.

--b.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28 13:49     ` bfields
@ 2021-09-28 14:04       ` Trond Myklebust
  2021-09-28 14:17         ` bfields
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2021-09-28 14:04 UTC (permalink / raw)
  To: bfields
  Cc: neilb, kolga, willy, tyhicks, davem, wanghai38, nicolas.dichtel,
	linux-kernel, edumazet, jlayton, dsahern, christian.brauner, ast,
	anna.schumaker, linux-nfs, viro, kuba, cong.wang, chuck.lever,
	kuniyu, timo, jiang.wang, wenbin.zeng, netdev, Rao.Shoaib,
	jakub.kicinski, tom

On Tue, 2021-09-28 at 09:49 -0400, bfields@fieldses.org wrote:
> On Tue, Sep 28, 2021 at 01:30:17PM +0000, Trond Myklebust wrote:
> > On Tue, 2021-09-28 at 11:14 +0800, Wang Hai wrote:
> > > When use-gss-proxy is set to 1, write_gssp() creates a rpc client
> > > in
> > > gssp_rpc_create(), this increases the netns refcount by 2, these
> > > refcounts are supposed to be released in rpcsec_gss_exit_net(),
> > > but
> > > it
> > > will never happen because rpcsec_gss_exit_net() is triggered only
> > > when
> > > the netns refcount gets to 0, specifically:
> > >     refcount=0 -> cleanup_net() -> ops_exit_list ->
> > > rpcsec_gss_exit_net
> > > It is a deadlock situation here, refcount will never get to 0
> > > unless
> > > rpcsec_gss_exit_net() is called. So, in this case, the netns
> > > refcount
> > > should not be increased.
> > > 
> > > In this case, xprt will take a netns refcount which is not
> > > supposed
> > > to be taken. Add a new flag to rpc_create_args called
> > > RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns refcount.
> > > 
> > > It is safe not to hold the netns refcount, because when
> > > cleanup_net(), it
> > > will hold the gssp_lock and then shut down the rpc client
> > > synchronously.
> > > 
> > > 
> > I don't like this solution at all. Adding this kind of flag is
> > going to
> > lead to problems down the road.
> > 
> > Is there any reason whatsoever why we need this RPC client to exist
> > when there is no active knfsd server? IOW: Is there any reason why
> > we
> > shouldn't defer creating this RPC client for when knfsd starts up
> > in
> > this net namespace, and why we can't shut it down when knfsd shuts
> > down?
> 
> The rpc create is done in the context of the process that writes to
> /proc/net/rpc/use-gss-proxy to get the right namespaces.  I don't
> know
> how hard it would be capture that information for a later create.
> 

svcauth_gss_proxy_init() uses the net namespace SVC_NET(rqstp) (i.e.
the knfsd namespace) in the call to gssp_accept_sec_context_upcall().

IOW: the net namespace used in the call to find the RPC client is the
one set up by knfsd, and so if use-gss-proxy was set in a different
namespace than the one used by knfsd, then it won't be found.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28 14:04       ` Trond Myklebust
@ 2021-09-28 14:17         ` bfields
  2021-09-28 14:27           ` Trond Myklebust
  0 siblings, 1 reply; 16+ messages in thread
From: bfields @ 2021-09-28 14:17 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: neilb, kolga, willy, tyhicks, davem, wanghai38, nicolas.dichtel,
	linux-kernel, edumazet, jlayton, dsahern, christian.brauner, ast,
	anna.schumaker, linux-nfs, viro, kuba, cong.wang, chuck.lever,
	kuniyu, timo, jiang.wang, wenbin.zeng, netdev, Rao.Shoaib,
	jakub.kicinski, tom

On Tue, Sep 28, 2021 at 02:04:49PM +0000, Trond Myklebust wrote:
> On Tue, 2021-09-28 at 09:49 -0400, bfields@fieldses.org wrote:
> > On Tue, Sep 28, 2021 at 01:30:17PM +0000, Trond Myklebust wrote:
> > > On Tue, 2021-09-28 at 11:14 +0800, Wang Hai wrote:
> > > > When use-gss-proxy is set to 1, write_gssp() creates a rpc client
> > > > in
> > > > gssp_rpc_create(), this increases the netns refcount by 2, these
> > > > refcounts are supposed to be released in rpcsec_gss_exit_net(),
> > > > but
> > > > it
> > > > will never happen because rpcsec_gss_exit_net() is triggered only
> > > > when
> > > > the netns refcount gets to 0, specifically:
> > > >     refcount=0 -> cleanup_net() -> ops_exit_list ->
> > > > rpcsec_gss_exit_net
> > > > It is a deadlock situation here, refcount will never get to 0
> > > > unless
> > > > rpcsec_gss_exit_net() is called. So, in this case, the netns
> > > > refcount
> > > > should not be increased.
> > > > 
> > > > In this case, xprt will take a netns refcount which is not
> > > > supposed
> > > > to be taken. Add a new flag to rpc_create_args called
> > > > RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns refcount.
> > > > 
> > > > It is safe not to hold the netns refcount, because when
> > > > cleanup_net(), it
> > > > will hold the gssp_lock and then shut down the rpc client
> > > > synchronously.
> > > > 
> > > > 
> > > I don't like this solution at all. Adding this kind of flag is
> > > going to
> > > lead to problems down the road.
> > > 
> > > Is there any reason whatsoever why we need this RPC client to exist
> > > when there is no active knfsd server? IOW: Is there any reason why
> > > we
> > > shouldn't defer creating this RPC client for when knfsd starts up
> > > in
> > > this net namespace, and why we can't shut it down when knfsd shuts
> > > down?
> > 
> > The rpc create is done in the context of the process that writes to
> > /proc/net/rpc/use-gss-proxy to get the right namespaces.  I don't
> > know
> > how hard it would be capture that information for a later create.
> > 
> 
> svcauth_gss_proxy_init() uses the net namespace SVC_NET(rqstp) (i.e.
> the knfsd namespace) in the call to gssp_accept_sec_context_upcall().
> 
> IOW: the net namespace used in the call to find the RPC client is the
> one set up by knfsd, and so if use-gss-proxy was set in a different
> namespace than the one used by knfsd, then it won't be found.

Right.  If you've got multiple containers, you don't want to find a
gss-proxy from a different container.

--b.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28 14:17         ` bfields
@ 2021-09-28 14:27           ` Trond Myklebust
  2021-09-28 14:57             ` bfields
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2021-09-28 14:27 UTC (permalink / raw)
  To: bfields
  Cc: neilb, tom, Rao.Shoaib, willy, tyhicks, davem, wanghai38,
	nicolas.dichtel, linux-kernel, edumazet, ast, jlayton,
	christian.brauner, chuck.lever, linux-nfs, viro, anna.schumaker,
	kuba, cong.wang, dsahern, timo, jiang.wang, kuniyu, netdev,
	wenbin.zeng, kolga, jakub.kicinski

On Tue, 2021-09-28 at 10:17 -0400, bfields@fieldses.org wrote:
> On Tue, Sep 28, 2021 at 02:04:49PM +0000, Trond Myklebust wrote:
> > On Tue, 2021-09-28 at 09:49 -0400, bfields@fieldses.org wrote:
> > > On Tue, Sep 28, 2021 at 01:30:17PM +0000, Trond Myklebust wrote:
> > > > On Tue, 2021-09-28 at 11:14 +0800, Wang Hai wrote:
> > > > > When use-gss-proxy is set to 1, write_gssp() creates a rpc
> > > > > client
> > > > > in
> > > > > gssp_rpc_create(), this increases the netns refcount by 2,
> > > > > these
> > > > > refcounts are supposed to be released in
> > > > > rpcsec_gss_exit_net(),
> > > > > but
> > > > > it
> > > > > will never happen because rpcsec_gss_exit_net() is triggered
> > > > > only
> > > > > when
> > > > > the netns refcount gets to 0, specifically:
> > > > >     refcount=0 -> cleanup_net() -> ops_exit_list ->
> > > > > rpcsec_gss_exit_net
> > > > > It is a deadlock situation here, refcount will never get to 0
> > > > > unless
> > > > > rpcsec_gss_exit_net() is called. So, in this case, the netns
> > > > > refcount
> > > > > should not be increased.
> > > > > 
> > > > > In this case, xprt will take a netns refcount which is not
> > > > > supposed
> > > > > to be taken. Add a new flag to rpc_create_args called
> > > > > RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns
> > > > > refcount.
> > > > > 
> > > > > It is safe not to hold the netns refcount, because when
> > > > > cleanup_net(), it
> > > > > will hold the gssp_lock and then shut down the rpc client
> > > > > synchronously.
> > > > > 
> > > > > 
> > > > I don't like this solution at all. Adding this kind of flag is
> > > > going to
> > > > lead to problems down the road.
> > > > 
> > > > Is there any reason whatsoever why we need this RPC client to
> > > > exist
> > > > when there is no active knfsd server? IOW: Is there any reason
> > > > why
> > > > we
> > > > shouldn't defer creating this RPC client for when knfsd starts
> > > > up
> > > > in
> > > > this net namespace, and why we can't shut it down when knfsd
> > > > shuts
> > > > down?
> > > 
> > > The rpc create is done in the context of the process that writes
> > > to
> > > /proc/net/rpc/use-gss-proxy to get the right namespaces.  I don't
> > > know
> > > how hard it would be capture that information for a later create.
> > > 
> > 
> > svcauth_gss_proxy_init() uses the net namespace SVC_NET(rqstp)
> > (i.e.
> > the knfsd namespace) in the call to
> > gssp_accept_sec_context_upcall().
> > 
> > IOW: the net namespace used in the call to find the RPC client is
> > the
> > one set up by knfsd, and so if use-gss-proxy was set in a different
> > namespace than the one used by knfsd, then it won't be found.
> 
> Right.  If you've got multiple containers, you don't want to find a
> gss-proxy from a different container.
> 

Exactly. So there is no namespace context to capture in the RPC client
other than what's already in knfsd.
The RPC client doesn't capture any other process context. It can cache
a user cred in order to capture the user namespace, but that
information appears to be unused by this gssd RPC client.

So I'll repeat my question: Why can't we set this gssd RPC client up at
knfsd startup time, and tear it down when knfsd is shut down?

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28 14:27           ` Trond Myklebust
@ 2021-09-28 14:57             ` bfields
  2021-09-28 15:36               ` Trond Myklebust
  0 siblings, 1 reply; 16+ messages in thread
From: bfields @ 2021-09-28 14:57 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: neilb, tom, Rao.Shoaib, willy, tyhicks, davem, wanghai38,
	nicolas.dichtel, linux-kernel, edumazet, ast, jlayton,
	christian.brauner, chuck.lever, linux-nfs, viro, anna.schumaker,
	kuba, cong.wang, dsahern, timo, jiang.wang, kuniyu, netdev,
	wenbin.zeng, kolga, jakub.kicinski

On Tue, Sep 28, 2021 at 02:27:33PM +0000, Trond Myklebust wrote:
> On Tue, 2021-09-28 at 10:17 -0400, bfields@fieldses.org wrote:
> > On Tue, Sep 28, 2021 at 02:04:49PM +0000, Trond Myklebust wrote:
> > > On Tue, 2021-09-28 at 09:49 -0400, bfields@fieldses.org wrote:
> > > > On Tue, Sep 28, 2021 at 01:30:17PM +0000, Trond Myklebust wrote:
> > > > > On Tue, 2021-09-28 at 11:14 +0800, Wang Hai wrote:
> > > > > > When use-gss-proxy is set to 1, write_gssp() creates a rpc
> > > > > > client
> > > > > > in
> > > > > > gssp_rpc_create(), this increases the netns refcount by 2,
> > > > > > these
> > > > > > refcounts are supposed to be released in
> > > > > > rpcsec_gss_exit_net(),
> > > > > > but
> > > > > > it
> > > > > > will never happen because rpcsec_gss_exit_net() is triggered
> > > > > > only
> > > > > > when
> > > > > > the netns refcount gets to 0, specifically:
> > > > > >     refcount=0 -> cleanup_net() -> ops_exit_list ->
> > > > > > rpcsec_gss_exit_net
> > > > > > It is a deadlock situation here, refcount will never get to 0
> > > > > > unless
> > > > > > rpcsec_gss_exit_net() is called. So, in this case, the netns
> > > > > > refcount
> > > > > > should not be increased.
> > > > > > 
> > > > > > In this case, xprt will take a netns refcount which is not
> > > > > > supposed
> > > > > > to be taken. Add a new flag to rpc_create_args called
> > > > > > RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns
> > > > > > refcount.
> > > > > > 
> > > > > > It is safe not to hold the netns refcount, because when
> > > > > > cleanup_net(), it
> > > > > > will hold the gssp_lock and then shut down the rpc client
> > > > > > synchronously.
> > > > > > 
> > > > > > 
> > > > > I don't like this solution at all. Adding this kind of flag is
> > > > > going to
> > > > > lead to problems down the road.
> > > > > 
> > > > > Is there any reason whatsoever why we need this RPC client to
> > > > > exist
> > > > > when there is no active knfsd server? IOW: Is there any reason
> > > > > why
> > > > > we
> > > > > shouldn't defer creating this RPC client for when knfsd starts
> > > > > up
> > > > > in
> > > > > this net namespace, and why we can't shut it down when knfsd
> > > > > shuts
> > > > > down?
> > > > 
> > > > The rpc create is done in the context of the process that writes
> > > > to
> > > > /proc/net/rpc/use-gss-proxy to get the right namespaces.  I don't
> > > > know
> > > > how hard it would be capture that information for a later create.
> > > > 
> > > 
> > > svcauth_gss_proxy_init() uses the net namespace SVC_NET(rqstp)
> > > (i.e.
> > > the knfsd namespace) in the call to
> > > gssp_accept_sec_context_upcall().
> > > 
> > > IOW: the net namespace used in the call to find the RPC client is
> > > the
> > > one set up by knfsd, and so if use-gss-proxy was set in a different
> > > namespace than the one used by knfsd, then it won't be found.
> > 
> > Right.  If you've got multiple containers, you don't want to find a
> > gss-proxy from a different container.
> > 
> 
> Exactly. So there is no namespace context to capture in the RPC client
> other than what's already in knfsd.
>
> The RPC client doesn't capture any other process context. It can cache
> a user cred in order to capture the user namespace, but that
> information appears to be unused by this gssd RPC client.

OK, that's good to know, thanks.

It's doing a path lookup (it uses an AF_LOCAL socket), and I'm not
assuming that will get the same result across containers.  Is there an
easy way to do just that path lookup here and delay the res till knfsd
startup?

--b.

> 
> So I'll repeat my question: Why can't we set this gssd RPC client up at
> knfsd startup time, and tear it down when knfsd is shut down?
> 
> -- 
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> trond.myklebust@hammerspace.com
> 
> 

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28 14:57             ` bfields
@ 2021-09-28 15:36               ` Trond Myklebust
  2021-09-28 15:43                 ` bfields
  0 siblings, 1 reply; 16+ messages in thread
From: Trond Myklebust @ 2021-09-28 15:36 UTC (permalink / raw)
  To: bfields
  Cc: neilb, jakub.kicinski, willy, tyhicks, davem, wanghai38,
	nicolas.dichtel, linux-kernel, edumazet, jlayton, ast,
	christian.brauner, chuck.lever, linux-nfs, viro, anna.schumaker,
	tom, kuba, cong.wang, dsahern, timo, jiang.wang, kuniyu, netdev,
	Rao.Shoaib, wenbin.zeng, kolga

On Tue, 2021-09-28 at 10:57 -0400, bfields@fieldses.org wrote:
> On Tue, Sep 28, 2021 at 02:27:33PM +0000, Trond Myklebust wrote:
> > On Tue, 2021-09-28 at 10:17 -0400, bfields@fieldses.org wrote:
> > > On Tue, Sep 28, 2021 at 02:04:49PM +0000, Trond Myklebust wrote:
> > > > On Tue, 2021-09-28 at 09:49 -0400, bfields@fieldses.org wrote:
> > > > > On Tue, Sep 28, 2021 at 01:30:17PM +0000, Trond Myklebust
> > > > > wrote:
> > > > > > On Tue, 2021-09-28 at 11:14 +0800, Wang Hai wrote:
> > > > > > > When use-gss-proxy is set to 1, write_gssp() creates a
> > > > > > > rpc
> > > > > > > client
> > > > > > > in
> > > > > > > gssp_rpc_create(), this increases the netns refcount by
> > > > > > > 2,
> > > > > > > these
> > > > > > > refcounts are supposed to be released in
> > > > > > > rpcsec_gss_exit_net(),
> > > > > > > but
> > > > > > > it
> > > > > > > will never happen because rpcsec_gss_exit_net() is
> > > > > > > triggered
> > > > > > > only
> > > > > > > when
> > > > > > > the netns refcount gets to 0, specifically:
> > > > > > >     refcount=0 -> cleanup_net() -> ops_exit_list ->
> > > > > > > rpcsec_gss_exit_net
> > > > > > > It is a deadlock situation here, refcount will never get
> > > > > > > to 0
> > > > > > > unless
> > > > > > > rpcsec_gss_exit_net() is called. So, in this case, the
> > > > > > > netns
> > > > > > > refcount
> > > > > > > should not be increased.
> > > > > > > 
> > > > > > > In this case, xprt will take a netns refcount which is
> > > > > > > not
> > > > > > > supposed
> > > > > > > to be taken. Add a new flag to rpc_create_args called
> > > > > > > RPC_CLNT_CREATE_NO_NET_REF for not increasing the netns
> > > > > > > refcount.
> > > > > > > 
> > > > > > > It is safe not to hold the netns refcount, because when
> > > > > > > cleanup_net(), it
> > > > > > > will hold the gssp_lock and then shut down the rpc client
> > > > > > > synchronously.
> > > > > > > 
> > > > > > > 
> > > > > > I don't like this solution at all. Adding this kind of flag
> > > > > > is
> > > > > > going to
> > > > > > lead to problems down the road.
> > > > > > 
> > > > > > Is there any reason whatsoever why we need this RPC client
> > > > > > to
> > > > > > exist
> > > > > > when there is no active knfsd server? IOW: Is there any
> > > > > > reason
> > > > > > why
> > > > > > we
> > > > > > shouldn't defer creating this RPC client for when knfsd
> > > > > > starts
> > > > > > up
> > > > > > in
> > > > > > this net namespace, and why we can't shut it down when
> > > > > > knfsd
> > > > > > shuts
> > > > > > down?
> > > > > 
> > > > > The rpc create is done in the context of the process that
> > > > > writes
> > > > > to
> > > > > /proc/net/rpc/use-gss-proxy to get the right namespaces.  I
> > > > > don't
> > > > > know
> > > > > how hard it would be capture that information for a later
> > > > > create.
> > > > > 
> > > > 
> > > > svcauth_gss_proxy_init() uses the net namespace SVC_NET(rqstp)
> > > > (i.e.
> > > > the knfsd namespace) in the call to
> > > > gssp_accept_sec_context_upcall().
> > > > 
> > > > IOW: the net namespace used in the call to find the RPC client
> > > > is
> > > > the
> > > > one set up by knfsd, and so if use-gss-proxy was set in a
> > > > different
> > > > namespace than the one used by knfsd, then it won't be found.
> > > 
> > > Right.  If you've got multiple containers, you don't want to find
> > > a
> > > gss-proxy from a different container.
> > > 
> > 
> > Exactly. So there is no namespace context to capture in the RPC
> > client
> > other than what's already in knfsd.
> > 
> > The RPC client doesn't capture any other process context. It can
> > cache
> > a user cred in order to capture the user namespace, but that
> > information appears to be unused by this gssd RPC client.
> 
> OK, that's good to know, thanks.
> 
> It's doing a path lookup (it uses an AF_LOCAL socket), and I'm not
> assuming that will get the same result across containers.  Is there
> an
> easy way to do just that path lookup here and delay the res till
> knfsd
> startup?
> 

What is the use case here? Starting the gssd daemon or knfsd in
separate chrooted environments? We already know that they have to be
started in the same net namespace, which pretty much ensures it has to
be the same container.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trond.myklebust@hammerspace.com



^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28 15:36               ` Trond Myklebust
@ 2021-09-28 15:43                 ` bfields
  2021-09-29 21:12                   ` bfields
  0 siblings, 1 reply; 16+ messages in thread
From: bfields @ 2021-09-28 15:43 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: neilb, jakub.kicinski, willy, tyhicks, davem, wanghai38,
	nicolas.dichtel, linux-kernel, edumazet, jlayton, ast,
	christian.brauner, chuck.lever, linux-nfs, viro, anna.schumaker,
	tom, kuba, cong.wang, dsahern, timo, jiang.wang, kuniyu, netdev,
	Rao.Shoaib, wenbin.zeng, kolga

On Tue, Sep 28, 2021 at 03:36:58PM +0000, Trond Myklebust wrote:
> What is the use case here? Starting the gssd daemon or knfsd in
> separate chrooted environments? We already know that they have to be
> started in the same net namespace, which pretty much ensures it has to
> be the same container.

Somehow I forgot that knfsd startup is happening in some real process's
context too (not just a kthread).

OK, great, I agree, that sounds like it should work.

--b.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-28 15:43                 ` bfields
@ 2021-09-29 21:12                   ` bfields
  2021-09-30  1:56                     ` wanghai (M)
  0 siblings, 1 reply; 16+ messages in thread
From: bfields @ 2021-09-29 21:12 UTC (permalink / raw)
  To: Trond Myklebust
  Cc: neilb, jakub.kicinski, willy, tyhicks, davem, wanghai38,
	nicolas.dichtel, linux-kernel, edumazet, jlayton, ast,
	christian.brauner, chuck.lever, linux-nfs, viro, anna.schumaker,
	tom, kuba, cong.wang, dsahern, timo, jiang.wang, kuniyu, netdev,
	Rao.Shoaib, wenbin.zeng, kolga

On Tue, Sep 28, 2021 at 11:43:00AM -0400, bfields@fieldses.org wrote:
> On Tue, Sep 28, 2021 at 03:36:58PM +0000, Trond Myklebust wrote:
> > What is the use case here? Starting the gssd daemon or knfsd in
> > separate chrooted environments? We already know that they have to be
> > started in the same net namespace, which pretty much ensures it has to
> > be the same container.
> 
> Somehow I forgot that knfsd startup is happening in some real process's
> context too (not just a kthread).
> 
> OK, great, I agree, that sounds like it should work.

Wang Hai, do you want to try that, or should I?

--b.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-29 21:12                   ` bfields
@ 2021-09-30  1:56                     ` wanghai (M)
  2021-11-09 17:21                       ` bfields
  0 siblings, 1 reply; 16+ messages in thread
From: wanghai (M) @ 2021-09-30  1:56 UTC (permalink / raw)
  To: bfields
  Cc: Trond Myklebust, neilb, jakub.kicinski, willy, tyhicks, davem,
	nicolas.dichtel, linux-kernel, edumazet, jlayton, ast,
	christian.brauner, chuck.lever, linux-nfs, viro, anna.schumaker,
	tom, kuba, cong.wang, dsahern, timo, jiang.wang, kuniyu, netdev,
	Rao.Shoaib, wenbin.zeng, kolga


在 2021/9/30 5:12, bfields@fieldses.org 写道:
> On Tue, Sep 28, 2021 at 11:43:00AM -0400, bfields@fieldses.org wrote:
>> On Tue, Sep 28, 2021 at 03:36:58PM +0000, Trond Myklebust wrote:
>>> What is the use case here? Starting the gssd daemon or knfsd in
>>> separate chrooted environments? We already know that they have to be
>>> started in the same net namespace, which pretty much ensures it has to
>>> be the same container.
>> Somehow I forgot that knfsd startup is happening in some real process's
>> context too (not just a kthread).
>>
>> OK, great, I agree, that sounds like it should work.
> Wang Hai, do you want to try that, or should I?
>
> --b.
> .
Thank you, of course with great pleasure. I tried the solution
suggested by Myklebust yesterday, but I can't seem to get this
done very well. It would be a great pleasure for me if you could
help to finish it. I can help test it after you finish it.

--
Wang Hai

>

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-09-30  1:56                     ` wanghai (M)
@ 2021-11-09 17:21                       ` bfields
  2021-11-17 19:19                         ` bfields
  0 siblings, 1 reply; 16+ messages in thread
From: bfields @ 2021-11-09 17:21 UTC (permalink / raw)
  To: wanghai (M)
  Cc: Trond Myklebust, neilb, jakub.kicinski, willy, tyhicks, davem,
	nicolas.dichtel, linux-kernel, edumazet, jlayton, ast,
	christian.brauner, chuck.lever, linux-nfs, viro, anna.schumaker,
	tom, kuba, cong.wang, dsahern, timo, jiang.wang, kuniyu, netdev,
	Rao.Shoaib, wenbin.zeng, kolga

On Thu, Sep 30, 2021 at 09:56:03AM +0800, wanghai (M) wrote:
> 
> 在 2021/9/30 5:12, bfields@fieldses.org 写道:
> >On Tue, Sep 28, 2021 at 11:43:00AM -0400, bfields@fieldses.org wrote:
> >>On Tue, Sep 28, 2021 at 03:36:58PM +0000, Trond Myklebust wrote:
> >>>What is the use case here? Starting the gssd daemon or knfsd in
> >>>separate chrooted environments? We already know that they have to be
> >>>started in the same net namespace, which pretty much ensures it has to
> >>>be the same container.
> >>Somehow I forgot that knfsd startup is happening in some real process's
> >>context too (not just a kthread).
> >>
> >>OK, great, I agree, that sounds like it should work.

Ugh, took me a while to get back to this and I went down a couple dead
ends.

The result from selinux's point of view is that rpc.nfsd is doing things
it previously only expected gssproxy to do.  Fixable with an update to
selinux policy.  And easily fixed in the meantime by cut-and-pasting the
suggestions from the logs.

Still, the result's that mounts fail when you update the kernel, which
seems a violation of our usual rules about regressions.  I'd like to do
better.

--b.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1
  2021-11-09 17:21                       ` bfields
@ 2021-11-17 19:19                         ` bfields
  0 siblings, 0 replies; 16+ messages in thread
From: bfields @ 2021-11-17 19:19 UTC (permalink / raw)
  To: wanghai (M)
  Cc: Trond Myklebust, neilb, jakub.kicinski, willy, tyhicks, davem,
	nicolas.dichtel, linux-kernel, edumazet, jlayton, ast,
	christian.brauner, chuck.lever, linux-nfs, viro, anna.schumaker,
	tom, kuba, cong.wang, dsahern, timo, jiang.wang, kuniyu, netdev,
	Rao.Shoaib, wenbin.zeng, kolga

On Tue, Nov 09, 2021 at 12:21:11PM -0500, bfields@fieldses.org wrote:
> On Thu, Sep 30, 2021 at 09:56:03AM +0800, wanghai (M) wrote:
> > 
> > 在 2021/9/30 5:12, bfields@fieldses.org 写道:
> > >On Tue, Sep 28, 2021 at 11:43:00AM -0400, bfields@fieldses.org wrote:
> > >>On Tue, Sep 28, 2021 at 03:36:58PM +0000, Trond Myklebust wrote:
> > >>>What is the use case here? Starting the gssd daemon or knfsd in
> > >>>separate chrooted environments? We already know that they have to be
> > >>>started in the same net namespace, which pretty much ensures it has to
> > >>>be the same container.
> > >>Somehow I forgot that knfsd startup is happening in some real process's
> > >>context too (not just a kthread).
> > >>
> > >>OK, great, I agree, that sounds like it should work.
> 
> Ugh, took me a while to get back to this and I went down a couple dead
> ends.
> 
> The result from selinux's point of view is that rpc.nfsd is doing things
> it previously only expected gssproxy to do.  Fixable with an update to
> selinux policy.  And easily fixed in the meantime by cut-and-pasting the
> suggestions from the logs.
> 
> Still, the result's that mounts fail when you update the kernel, which
> seems a violation of our usual rules about regressions.  I'd like to do
> better.

So, I'm not applying this, but here's the patch, for what it's worth.

I'm not sure what the alternative is.  Do we get a chance to do
something when gssproxy closes the connection, maybe?

--b.

commit 9fc4ae28ec95
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Thu Nov 4 16:55:28 2021 -0400

    nfsd: connect to gssp-proxy on nfsd start
    
    We communicate with gss-proxy using by rpc over a local unix socket.
    The rpc client is set up in the context of the process that writes to
    /proc/net/rpc/use-gss-proxy.  Unfortunately that leaves us with no clear
    place to shut down that client; we've been trying to shut it down when
    the network namespace is destroyed, but the rpc client holds a reference
    on the network namespace, so that never happens.
    
    We do need to create the client in the context of a process that's
    likely to be in the correct namespace.  We can use rpc.nfsd instead.  In
    particular, we use create the rpc client when sockets are added to an
    rpc server.
    
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>

diff --git a/include/linux/sunrpc/svcauth.h b/include/linux/sunrpc/svcauth.h
index 6d9cc9080aca..b60ecb51511d 100644
--- a/include/linux/sunrpc/svcauth.h
+++ b/include/linux/sunrpc/svcauth.h
@@ -183,4 +183,12 @@ static inline unsigned long hash_mem(char const *buf, int length, int bits)
 	return full_name_hash(NULL, buf, length) >> (32 - bits);
 }
 
+struct gssp_clnt_ops {
+	struct module *owner;
+	int (*constructor)(struct net*);
+	void (*destructor)(struct net*);
+};
+
+extern struct gssp_clnt_ops gpops;
+
 #endif /* _LINUX_SUNRPC_SVCAUTH_H_ */
diff --git a/net/sunrpc/auth_gss/gss_rpc_upcall.c b/net/sunrpc/auth_gss/gss_rpc_upcall.c
index 61c276bddaf2..04a311305b26 100644
--- a/net/sunrpc/auth_gss/gss_rpc_upcall.c
+++ b/net/sunrpc/auth_gss/gss_rpc_upcall.c
@@ -122,7 +122,6 @@ static int gssp_rpc_create(struct net *net, struct rpc_clnt **_clnt)
 
 void init_gssp_clnt(struct sunrpc_net *sn)
 {
-	mutex_init(&sn->gssp_lock);
 	sn->gssp_clnt = NULL;
 }
 
@@ -132,25 +131,23 @@ int set_gssp_clnt(struct net *net)
 	struct rpc_clnt *clnt;
 	int ret;
 
-	mutex_lock(&sn->gssp_lock);
-	ret = gssp_rpc_create(net, &clnt);
-	if (!ret) {
-		if (sn->gssp_clnt)
-			rpc_shutdown_client(sn->gssp_clnt);
+	if (!sn->gssp_clnt) {
+		ret = gssp_rpc_create(net, &clnt);
+		if (ret)
+			return ret;
 		sn->gssp_clnt = clnt;
 	}
-	mutex_unlock(&sn->gssp_lock);
-	return ret;
+	return 0;
 }
 
-void clear_gssp_clnt(struct sunrpc_net *sn)
+void clear_gssp_clnt(struct net *net)
 {
-	mutex_lock(&sn->gssp_lock);
+	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
 	if (sn->gssp_clnt) {
 		rpc_shutdown_client(sn->gssp_clnt);
 		sn->gssp_clnt = NULL;
 	}
-	mutex_unlock(&sn->gssp_lock);
 }
 
 static struct rpc_clnt *get_gssp_clnt(struct sunrpc_net *sn)
diff --git a/net/sunrpc/auth_gss/gss_rpc_upcall.h b/net/sunrpc/auth_gss/gss_rpc_upcall.h
index 31e96344167e..fd70f4fb56a9 100644
--- a/net/sunrpc/auth_gss/gss_rpc_upcall.h
+++ b/net/sunrpc/auth_gss/gss_rpc_upcall.h
@@ -31,6 +31,6 @@ void gssp_free_upcall_data(struct gssp_upcall_data *data);
 
 void init_gssp_clnt(struct sunrpc_net *);
 int set_gssp_clnt(struct net *);
-void clear_gssp_clnt(struct sunrpc_net *);
+void clear_gssp_clnt(struct net *);
 
 #endif /* _GSS_RPC_UPCALL_H */
diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 7dba6a9c213a..79212437558f 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1449,9 +1449,6 @@ static ssize_t write_gssp(struct file *file, const char __user *buf,
 		return res;
 	if (i != 1)
 		return -EINVAL;
-	res = set_gssp_clnt(net);
-	if (res)
-		return res;
 	res = set_gss_proxy(net, 1);
 	if (res)
 		return res;
@@ -1505,10 +1502,8 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net)
 {
 	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
 
-	if (sn->use_gssp_proc) {
+	if (sn->use_gssp_proc)
 		remove_proc_entry("use-gss-proxy", sn->proc_net_rpc);
-		clear_gssp_clnt(sn);
-	}
 }
 #else /* CONFIG_PROC_FS */
 
@@ -1999,9 +1994,16 @@ gss_svc_shutdown_net(struct net *net)
 	rsc_cache_destroy_net(net);
 }
 
+struct gssp_clnt_ops mygpops = {
+	.owner = THIS_MODULE,
+	.constructor = set_gssp_clnt,
+	.destructor = clear_gssp_clnt,
+};
+
 int
 gss_svc_init(void)
 {
+	gpops = mygpops;
 	return svc_auth_register(RPC_AUTH_GSS, &svcauthops_gss);
 }
 
@@ -2009,4 +2011,5 @@ void
 gss_svc_shutdown(void)
 {
 	svc_auth_unregister(RPC_AUTH_GSS);
+	gpops = mygpops;
 }
diff --git a/net/sunrpc/netns.h b/net/sunrpc/netns.h
index 7ec10b92bea1..7d8653b3d81b 100644
--- a/net/sunrpc/netns.h
+++ b/net/sunrpc/netns.h
@@ -28,6 +28,7 @@ struct sunrpc_net {
 	unsigned int rpcb_is_af_local : 1;
 
 	struct mutex gssp_lock;
+	int gssp_users;
 	struct rpc_clnt *gssp_clnt;
 	int use_gss_proxy;
 	int pipe_version;
diff --git a/net/sunrpc/sunrpc_syms.c b/net/sunrpc/sunrpc_syms.c
index 691c0000e9ea..463c975151d7 100644
--- a/net/sunrpc/sunrpc_syms.c
+++ b/net/sunrpc/sunrpc_syms.c
@@ -54,6 +54,8 @@ static __net_init int sunrpc_init_net(struct net *net)
 	INIT_LIST_HEAD(&sn->all_clients);
 	spin_lock_init(&sn->rpc_client_lock);
 	spin_lock_init(&sn->rpcb_clnt_lock);
+	mutex_init(&sn->gssp_lock);
+	sn->gssp_users = 0;
 	return 0;
 
 err_pipefs:
diff --git a/net/sunrpc/svc_xprt.c b/net/sunrpc/svc_xprt.c
index 1e99ba1b9d72..30d9a9779093 100644
--- a/net/sunrpc/svc_xprt.c
+++ b/net/sunrpc/svc_xprt.c
@@ -19,6 +19,7 @@
 #include <linux/module.h>
 #include <linux/netdevice.h>
 #include <trace/events/sunrpc.h>
+#include "netns.h"
 
 #define RPCDBG_FACILITY	RPCDBG_SVCXPRT
 
@@ -322,6 +323,37 @@ static int _svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
 	return -EPROTONOSUPPORT;
 }
 
+struct gssp_clnt_ops gpops = {};
+EXPORT_SYMBOL_GPL(gpops);
+
+int get_gssp_clnt(struct net *net)
+{
+	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+	int ret;
+
+	mutex_lock(&sn->gssp_lock);
+
+	if (try_module_get(gpops.owner) && gpops.constructor(net)) {
+		ret = gpops.constructor(net);
+		module_put(gpops.owner);
+	}
+	sn->gssp_users++;
+	mutex_unlock(&sn->gssp_lock);
+
+	return ret;
+}
+
+void put_gssp_clnt(struct net *net)
+{
+	struct sunrpc_net *sn = net_generic(net, sunrpc_net_id);
+
+	mutex_lock(&sn->gssp_lock);
+	sn->gssp_users--;
+	if (!sn->gssp_users && sn->gssp_clnt)
+		gpops.destructor(net);
+	mutex_unlock(&sn->gssp_lock);
+}
+
 int svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
 		    struct net *net, const int family,
 		    const unsigned short port, int flags,
@@ -329,11 +361,15 @@ int svc_create_xprt(struct svc_serv *serv, const char *xprt_name,
 {
 	int err;
 
+	get_gssp_clnt(net);
+
 	err = _svc_create_xprt(serv, xprt_name, net, family, port, flags, cred);
 	if (err == -EPROTONOSUPPORT) {
 		request_module("svc%s", xprt_name);
 		err = _svc_create_xprt(serv, xprt_name, net, family, port, flags, cred);
 	}
+	if (err < 0)
+		put_gssp_clnt(net);
 	return err;
 }
 EXPORT_SYMBOL_GPL(svc_create_xprt);
@@ -1038,6 +1074,7 @@ static void svc_delete_xprt(struct svc_xprt *xprt)
 {
 	struct svc_serv	*serv = xprt->xpt_server;
 	struct svc_deferred_req *dr;
+	struct net *net = xprt->xpt_net;
 
 	if (test_and_set_bit(XPT_DEAD, &xprt->xpt_flags))
 		return;
@@ -1058,6 +1095,8 @@ static void svc_delete_xprt(struct svc_xprt *xprt)
 		kfree(dr);
 
 	call_xpt_users(xprt);
+	if (!test_bit(XPT_TEMP, &xprt->xpt_flags))
+		put_gssp_clnt(net);
 	svc_xprt_put(xprt);
 }
 

^ permalink raw reply related	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2021-11-17 19:19 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2021-09-28  3:14 [PATCH net 0/2] auth_gss: Fix netns refcount leaks when use-gss-proxy==1 Wang Hai
2021-09-28  3:14 ` [PATCH net 1/2] net: Modify unix_stream_connect to not reference count the netns of kernel sockets Wang Hai
2021-09-28 12:50   ` Kuniyuki Iwashima
2021-09-28  3:14 ` [PATCH net 2/2] auth_gss: Fix deadlock that blocks rpcsec_gss_exit_net when use-gss-proxy==1 Wang Hai
2021-09-28 13:30   ` Trond Myklebust
2021-09-28 13:49     ` bfields
2021-09-28 14:04       ` Trond Myklebust
2021-09-28 14:17         ` bfields
2021-09-28 14:27           ` Trond Myklebust
2021-09-28 14:57             ` bfields
2021-09-28 15:36               ` Trond Myklebust
2021-09-28 15:43                 ` bfields
2021-09-29 21:12                   ` bfields
2021-09-30  1:56                     ` wanghai (M)
2021-11-09 17:21                       ` bfields
2021-11-17 19:19                         ` bfields

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).