* Re: [Qestion] Lots of memory leaks when mounting and unmounting nfs client to server continuously.
From: zhong jiang @ 2018-10-30 13:58 UTC
To: Benjamin Coddington, herbert, trond.myklebust, bfields
Cc: linux-crypto, LKML, linux-nfs

On 2018/10/30 21:06, Benjamin Coddington wrote:
> Hi zhong jiang,
>
> Try asking in linux-nfs.. but I'll also note that 3.10-stable may be
> missing a number of fixes to leaks in the NFS GSS code.
>
> I can see more than a few fixes to memory leaks with:
>     git log --grep=leak --oneline net/sunrpc/auth_gss/
>
Thanks for your reply. I have tested some of the upstream patches, as you
suggested, but they fail to solve the issue completely.
Hence, I am asking the relevant experts whether they have run into this
issue or can offer any suggestions.

Thanks,
zhong jiang
> Ben
>
> On 30 Oct 2018, at 8:45, zhong jiang wrote:
>
>> Hi, Herbert
>>
>> Recently, I hit a memory leak issue when mounting and unmounting nfs using krb5.
>> The issue happens on linux-3.10-stable.
>>
>> I find that slab-1024 and slab-512 take up most of the memory, and it cannot be freed.
>> Meanwhile, as a result, rpcsec_gss_krb5 cannot be unregistered either.
>>
>> nfs-sve1:/home # cat /proc/modules | grep krb5
>> rpcsec_gss_krb5 31477 239730 - Live 0xffffffffa0334000
>> auth_rpcgss 59314 3 rpcsec_gss_krb5,nfsd, Live 0xffffffffa0123000
>> sunrpc 300546 25 rpcsec_gss_krb5,nfsd,auth_rpcgss,nfs_acl,lockd, Live 0xffffffffa013b000
>>
>> I enabled the slab-1024 trace via /sys/kernel/slab/:t-0001024/trace and got the following:
>>
>> [123420.989831] Call Trace:
>> [123420.989834] [<ffffffff81642d2a>] dump_stack+0x19/0x1b
>> [123420.989837] [<ffffffff8163f25e>] alloc_debug_processing+0xc5/0x118
>> [123420.989839] [<ffffffff8163fd4d>] __slab_alloc+0x400/0x48f
>> [123420.989841] [<ffffffff812b1795>] ? __crypto_alloc_tfm+0x45/0x170
>> [123420.989845] [<ffffffff812b2307>] ? setkey+0x57/0x110
>> [123420.989847] [<ffffffff8118b5fd>] ? kzfree+0x2d/0x30
>> [123420.989850] [<ffffffff811c6e88>] __kmalloc+0x1c8/0x230
>> [123420.989852] [<ffffffff812b1795>] __crypto_alloc_tfm+0x45/0x170
>> [123420.989854] [<ffffffff812b2e45>] crypto_spawn_tfm+0x45/0x80
>> [123420.989857] [<ffffffff811c6eb3>] ? __kmalloc+0x1f3/0x230
>> [123420.989859] [<ffffffff812c15c7>] crypto_cbc_init_tfm+0x27/0x40
>> [123420.989864] [<ffffffff812b1851>] __crypto_alloc_tfm+0x101/0x170
>> [123420.989866] [<ffffffff812b1ffc>] crypto_alloc_base+0x4c/0xb0
>> [123420.989869] [<ffffffffa033411b>] context_v2_alloc_cipher.isra.2+0x2b/0xc0 [rpcsec_gss_krb5]
>> [123420.989871] [<ffffffffa0334da8>] gss_import_sec_context_kerberos+0xbf8/0xf00 [rpcsec_gss_krb5]
>> [123420.989875] [<ffffffffa0126d5d>] gss_import_sec_context+0x7d/0xb0 [auth_rpcgss]
>> [123420.989878] [<ffffffffa012b35e>] gss_proxy_save_rsc+0x137/0x1b0 [auth_rpcgss]
>> [123420.989884] [<ffffffffa012b51e>] svcauth_gss_proxy_init+0x147/0x1e4 [auth_rpcgss]
>> [123420.989886] [<ffffffff810c2ad6>] ? dequeue_entity+0x106/0x520
>> [123420.989890] [<ffffffffa0128e2a>] svcauth_gss_accept+0x3da/0xb70 [auth_rpcgss]
>> [123420.989892] [<ffffffff810b6c25>] ? check_preempt_curr+0x85/0xa0
>> [123420.989894] [<ffffffff810b6c59>] ? ttwu_do_wakeup+0x19/0xd0
>> [123420.989897] [<ffffffff810b6ded>] ? ttwu_do_activate.constprop.86+0x5d/0x70
>> [123420.989900] [<ffffffff810b9422>] ? try_to_wake_up+0x162/0x330
>> [123420.989908] [<ffffffffa014f490>] svc_authenticate+0xc0/0xe0 [sunrpc]
>> [123420.989914] [<ffffffffa014c04a>] svc_process_common+0x21a/0x6f0 [sunrpc]
>> [123420.989921] [<ffffffffa014c623>] svc_process+0x103/0x170 [sunrpc]
>> [123420.989928] [<ffffffffa01baaaf>] nfsd+0xdf/0x150 [nfsd]
>> [123420.989932] [<ffffffffa01ba9d0>] ? nfsd_destroy+0x80/0x80 [nfsd]
>> [123420.989934] [<ffffffff810a648f>] kthread+0xcf/0xe0
>> [123420.989936] [<ffffffff810a63c0>] ? kthread_create_on_node+0x140/0x140
>> [123420.989939] [<ffffffff81653318>] ret_from_fork+0x58/0x90
>> [123420.989943] [<ffffffff810a63c0>] ? kthread_create_on_node+0x140/0x140
>>
>> I am unfamiliar with crypto. I would appreciate any suggestions.
>>
>> Thanks,
>> zhong jiang

* Re: [Qestion] Lots of memory leaks when mounting and unmounting nfs client to server continuously.
From: Benjamin Coddington @ 2018-10-30 14:03 UTC
To: zhong jiang
Cc: herbert, trond.myklebust, bfields, linux-crypto, LKML, linux-nfs

On 30 Oct 2018, at 9:58, zhong jiang wrote:

> On 2018/10/30 21:06, Benjamin Coddington wrote:
>> Hi zhong jiang,
>>
>> Try asking in linux-nfs.. but I'll also note that 3.10-stable may be
>> missing a number of fixes to leaks in the NFS GSS code.
>>
>> I can see more than a few fixes to memory leaks with:
>>     git log --grep=leak --oneline net/sunrpc/auth_gss/
>>
> Thanks for your reply. I have tested some of the upstream patches, as you
> suggested, but they fail to solve the issue completely.

What have you tested? It is hard to help without specifics.

* Re: [Qestion] Lots of memory leaks when mounting and unmounting nfs client to server continuously.
From: zhong jiang @ 2018-10-30 14:29 UTC
To: Benjamin Coddington
Cc: herbert, trond.myklebust, bfields, linux-crypto, LKML, linux-nfs

On 2018/10/30 22:03, Benjamin Coddington wrote:
> On 30 Oct 2018, at 9:58, zhong jiang wrote:
>
>> On 2018/10/30 21:06, Benjamin Coddington wrote:
>>> Hi zhong jiang,
>>>
>>> Try asking in linux-nfs.. but I'll also note that 3.10-stable may be
>>> missing a number of fixes to leaks in the NFS GSS code.
>>>
>>> I can see more than a few fixes to memory leaks with:
>>>     git log --grep=leak --oneline net/sunrpc/auth_gss/
>>>
>> Thanks for your reply. I have tested some of the upstream patches, as you
>> suggested, but they fail to solve the issue completely.
> What have you tested? It is hard to help without specifics.

In the latest mainline, filtering the history of net/sunrpc/auth_gss by the
keyword "leak" gives the following results:

0070ed3 Fix 16-byte memory leak in gssp_accept_sec_context_upcall (tested, does not help)
78794d1 svcrpc: don't leak contexts on PROC_DESTROY (tested, does not help)
a1d1e9b svcrpc: fix memory leak in gssp_accept_sec_context_upcall (not yet tested)
e9776d0 SUNRPC: Fix a pipe_version reference leak (not yet tested)
cdead7c SUNRPC: Fix a potential memory leak in auth_gss (not yet tested)
980e5a4 nfsd: fix rsi_cache reference count leak (not yet tested)
07a2bf1 SUNRPC: Fix a memory leak in gss_create() (not yet tested)
3ab9bb7 SUNRPC: Fix a memory leak in the auth credcache code (already present)
54f9247 knfsd: fix resource leak resulting in module refcount leak for rpcsec_gss_krb5.ko (already present)
b797b5b [PATCH] knfsd: svcrpc: fix gss krb5i memory leak (already present)
d4a30e7 RPCSEC_GSS: fix leak in krb5 code caused by superfluous kmalloc (not yet tested)

I suspect that commit d4a30e7 ("RPCSEC_GSS: fix leak in krb5 code caused by
superfluous kmalloc") will solve the issue, although I am not sure. :-[
Next, I will adapt that patch to 3.10 and see what happens.

Thanks,
zhong jiang.

* Re: [Qestion] Lots of memory leaks when mounting and unmounting nfs client to server continuously.
From: zhong jiang @ 2018-11-01 14:18 UTC
To: Benjamin Coddington
Cc: herbert, trond.myklebust, bfields, linux-crypto, LKML, linux-nfs

On 2018/10/30 22:03, Benjamin Coddington wrote:
> On 30 Oct 2018, at 9:58, zhong jiang wrote:
>
>> On 2018/10/30 21:06, Benjamin Coddington wrote:
>>> Hi zhong jiang,
>>>
>>> Try asking in linux-nfs.. but I'll also note that 3.10-stable may be
>>> missing a number of fixes to leaks in the NFS GSS code.
>>>
>>> I can see more than a few fixes to memory leaks with:
>>>     git log --grep=leak --oneline net/sunrpc/auth_gss/
>>>
>> Thanks for your reply. I have tested some of the upstream patches, as you
>> suggested, but they fail to solve the issue completely.
> What have you tested? It is hard to help without specifics.

Hi, Benjamin

I have now tested all of the patches in the latest mainline that are listed by:

    git log --grep=leak --oneline net/sunrpc/auth_gss/

Unfortunately, none of them works. Could you give me some clues?

Thanks,
zhong jiang

* Re: [Qestion] Lots of memory leaks when mounting and unmounting nfs client to server continuously.
From: Dave Wysochanski @ 2018-11-07 19:49 UTC
To: zhong jiang, Benjamin Coddington, herbert, trond.myklebust, bfields
Cc: linux-crypto, LKML, linux-nfs

On Tue, 2018-10-30 at 21:58 +0800, zhong jiang wrote:
> On 2018/10/30 21:06, Benjamin Coddington wrote:
> > Hi zhong jiang,
> >
> > Try asking in linux-nfs.. but I'll also note that 3.10-stable may
> > be missing a number of fixes to leaks in the NFS GSS code.
> >
> > I can see more than a few fixes to memory leaks with:
> >     git log --grep=leak --oneline net/sunrpc/auth_gss/
> >
>
> Thanks for your reply. I have tested some of the upstream patches, as
> you suggested, but they fail to solve the issue completely.
> Hence, I am asking the relevant experts whether they have run into
> this issue or can offer any suggestions.
>
> Thanks,
> zhong jiang
> > Ben
> >
> > On 30 Oct 2018, at 8:45, zhong jiang wrote:
> >
> > > Hi, Herbert
> > >
> > > Recently, I hit a memory leak issue when mounting and
> > > unmounting nfs using krb5.
> > > The issue happens on linux-3.10-stable.
> > >
> > > I find that slab-1024 and slab-512 take up most of the
> > > memory, and it cannot be freed.
> > > Meanwhile, as a result, rpcsec_gss_krb5 cannot be unregistered
> > > either.
> > >

Are you running the latest 3.10-stable?

This sounds very similar to something I encountered a while ago, which
turned out to be a sunrpc cache problem. The patch that fixed it for me
is in 3.10.106, though.

Can you check whether this cache is growing indefinitely?
    /proc/net/rpc/auth.rpcsec.context

If it is large, try flushing it explicitly with:
    date +%s > /proc/net/rpc/auth.rpcsec.context/flush

If all that checks out, you may need the upstream fix below, but it
went into v3.10.106 as
6a4a5fd svcrpc: don't leak contexts on PROC_DESTROY

commit 6a4a5fd4c7bc6a06ca26ad7327d046d8d3c0932a
Author: J. Bruce Fields <bfields@redhat.com>
Date:   Mon Jan 9 17:15:18 2017 -0500

    svcrpc: don't leak contexts on PROC_DESTROY

    commit 78794d1890708cf94e3961261e52dcec2cc34722 upstream.

    Context expiry times are in units of seconds since boot, not unix time.

    The use of get_seconds() here therefore sets the expiry time decades in
    the future. This prevents timely freeing of contexts destroyed by
    client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
    (when the module is unloaded or the container shut down), but a lot of
    contexts could pile up before then.

    Fixes: c5b29f885afe "sunrpc: use seconds since boot in expiry cache"
    Reported-by: Andy Adamson <andros@netapp.com>
    Signed-off-by: J. Bruce Fields <bfields@redhat.com>
    Signed-off-by: Willy Tarreau <w@1wt.eu>

diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
index 62663a0..e625efe 100644
--- a/net/sunrpc/auth_gss/svcauth_gss.c
+++ b/net/sunrpc/auth_gss/svcauth_gss.c
@@ -1518,7 +1518,7 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
 	case RPC_GSS_PROC_DESTROY:
 		if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
 			goto auth_err;
-		rsci->h.expiry_time = get_seconds();
+		rsci->h.expiry_time = seconds_since_boot();
 		set_bit(CACHE_NEGATIVE, &rsci->h.flags);
 		if (resv->iov_len + 4 > PAGE_SIZE)
 			goto drop;

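For readers following the fix in the message above, here is a minimal sketch in C of why the unit mismatch described in the commit message keeps destroyed contexts around. It is a simplified userspace model, not the kernel's code: the function names model_entry_expired and model_seconds_until_expiry are hypothetical, and the model assumes only that an entry counts as expired once its expiry time is earlier than the current seconds-since-boot value.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Simplified model (an assumption for illustration, not kernel code):
 * an entry is expired once its expiry time is earlier than "now",
 * where "now" is measured in seconds since boot.
 */
bool model_entry_expired(int64_t expiry_time, int64_t now_since_boot)
{
	return expiry_time < now_since_boot;
}

/* Seconds the entry will still be kept; 0 if it has already expired. */
int64_t model_seconds_until_expiry(int64_t expiry_time,
				   int64_t now_since_boot)
{
	if (model_entry_expired(expiry_time, now_since_boot))
		return 0;
	return expiry_time - now_since_boot;
}
```

With illustrative numbers, take a server that has been up for three days (259200 seconds since boot) while the wall clock reads about 1540000000 seconds since the epoch (October 2018). Storing the wall-clock value as the expiry time leaves 1539740800 seconds to go, which is nearly 49 years. Storing the seconds-since-boot value instead lets the entry expire one second later.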
* Re: [Qestion] Lots of memory leaks when mounting and unmounting nfs client to server continuously.
From: zhong jiang @ 2018-11-13 6:40 UTC
To: Dave Wysochanski
Cc: Benjamin Coddington, herbert, trond.myklebust, bfields, linux-crypto, LKML, linux-nfs

On 2018/11/8 3:49, Dave Wysochanski wrote:
> On Tue, 2018-10-30 at 21:58 +0800, zhong jiang wrote:
>> On 2018/10/30 21:06, Benjamin Coddington wrote:
>>> Hi zhong jiang,
>>>
>>> Try asking in linux-nfs.. but I'll also note that 3.10-stable may
>>> be missing a number of fixes to leaks in the NFS GSS code.
>>>
>>> I can see more than a few fixes to memory leaks with:
>>>     git log --grep=leak --oneline net/sunrpc/auth_gss/
>>>
>> Thanks for your reply. I have tested some of the upstream patches, as
>> you suggested, but they fail to solve the issue completely.
>> Hence, I am asking the relevant experts whether they have run into
>> this issue or can offer any suggestions.
>>
>> Thanks,
>> zhong jiang
>>> Ben
>>>
>>> On 30 Oct 2018, at 8:45, zhong jiang wrote:
>>>
>>>> Hi, Herbert
>>>>
>>>> Recently, I hit a memory leak issue when mounting and
>>>> unmounting nfs using krb5.
>>>> The issue happens on linux-3.10-stable.
>>>>
>>>> I find that slab-1024 and slab-512 take up most of the
>>>> memory, and it cannot be freed.
>>>> Meanwhile, as a result, rpcsec_gss_krb5 cannot be unregistered
>>>> either.
>>>>
> Are you running the latest 3.10-stable?
>
> This sounds very similar to something I encountered a while ago, which
> turned out to be a sunrpc cache problem. The patch that fixed it for me
> is in 3.10.106, though.
>
> Can you check whether this cache is growing indefinitely?
>     /proc/net/rpc/auth.rpcsec.context
>
> If it is large, try flushing it explicitly with:
>     date +%s > /proc/net/rpc/auth.rpcsec.context/flush
>
> If all that checks out, you may need the upstream fix below, but it
> went into v3.10.106 as
> 6a4a5fd svcrpc: don't leak contexts on PROC_DESTROY
>
> commit 6a4a5fd4c7bc6a06ca26ad7327d046d8d3c0932a
> Author: J. Bruce Fields <bfields@redhat.com>
> Date:   Mon Jan 9 17:15:18 2017 -0500
>
>     svcrpc: don't leak contexts on PROC_DESTROY
>
>     commit 78794d1890708cf94e3961261e52dcec2cc34722 upstream.
>
>     Context expiry times are in units of seconds since boot, not unix time.
>
>     The use of get_seconds() here therefore sets the expiry time decades in
>     the future. This prevents timely freeing of contexts destroyed by
>     client RPC_GSS_PROC_DESTROY requests. We'd still free them eventually
>     (when the module is unloaded or the container shut down), but a lot of
>     contexts could pile up before then.
>
>     Fixes: c5b29f885afe "sunrpc: use seconds since boot in expiry cache"
>     Reported-by: Andy Adamson <andros@netapp.com>
>     Signed-off-by: J. Bruce Fields <bfields@redhat.com>
>     Signed-off-by: Willy Tarreau <w@1wt.eu>
>
> diff --git a/net/sunrpc/auth_gss/svcauth_gss.c b/net/sunrpc/auth_gss/svcauth_gss.c
> index 62663a0..e625efe 100644
> --- a/net/sunrpc/auth_gss/svcauth_gss.c
> +++ b/net/sunrpc/auth_gss/svcauth_gss.c
> @@ -1518,7 +1518,7 @@ static void destroy_use_gss_proxy_proc_entry(struct net *net) {}
>  	case RPC_GSS_PROC_DESTROY:
>  		if (gss_write_verf(rqstp, rsci->mechctx, gc->gc_seq))
>  			goto auth_err;
> -		rsci->h.expiry_time = get_seconds();
> +		rsci->h.expiry_time = seconds_since_boot();
>  		set_bit(CACHE_NEGATIVE, &rsci->h.flags);
>  		if (resv->iov_len + 4 > PAGE_SIZE)
>  			goto drop;
>
Hi, Dave

Thank you for your kind help, and sorry for the late reply; I have only
just finished testing the patch. On its own it does not fully fix the
problem, but when I combine the following three upstream patches, the
issue no longer occurs:

0070ed3 Fix 16-byte memory leak in gssp_accept_sec_context_upcall
78794d1 svcrpc: don't leak contexts on PROC_DESTROY
a1d1e9b svcrpc: fix memory leak in gssp_accept_sec_context_upcall

I think we should backport the relevant patches to stable-3.10.

Thanks,
zhong jiang