* qedr memory leak report
From: Chuck Lever @ 2019-08-30 18:03 UTC
To: Michal Kalderon; +Cc: linux-rdma

Hi Michal-

In the middle of some other testing, I got this kmemleak report
while testing with FastLinq cards in iWARP mode:

unreferenced object 0xffff888458923340 (size 32):
  comm "mount.nfs", pid 2294, jiffies 4298338848 (age 1144.337s)
  hex dump (first 32 bytes):
    20 1d 69 63 88 88 ff ff 20 1d 69 63 88 88 ff ff   .ic.... .ic....
    00 60 7a 69 84 88 ff ff 00 60 82 f9 00 00 00 00  .`zi.....`......
  backtrace:
    [<000000000df5bfed>] __kmalloc+0x128/0x176
    [<0000000020724641>] qedr_alloc_pbl_tbl.constprop.44+0x3c/0x121 [qedr]
    [<00000000a361c591>] init_mr_info.constprop.41+0xaf/0x21f [qedr]
    [<00000000e8049714>] qedr_alloc_mr+0x95/0x2c1 [qedr]
    [<000000000e6102bc>] ib_alloc_mr_user+0x31/0x96 [ib_core]
    [<00000000d254a9fb>] frwr_init_mr+0x23/0x121 [rpcrdma]
    [<00000000a0364e35>] rpcrdma_mrs_create+0x45/0xea [rpcrdma]
    [<00000000fd6bf282>] rpcrdma_buffer_create+0x9e/0x1c9 [rpcrdma]
    [<00000000be3a1eba>] xprt_setup_rdma+0x109/0x279 [rpcrdma]
    [<00000000b736b88f>] xprt_create_transport+0x39/0x19a [sunrpc]
    [<000000001024e4dc>] rpc_create+0x118/0x1ab [sunrpc]
    [<00000000cca43a49>] nfs_create_rpc_client+0xf8/0x15f [nfs]
    [<00000000073c962c>] nfs_init_client+0x1a/0x3b [nfs]
    [<00000000b03964c4>] nfs_init_server+0xc1/0x212 [nfs]
    [<000000001c71f609>] nfs_create_server+0x74/0x1a4 [nfs]
    [<000000004dc919a1>] nfs3_create_server+0xb/0x25 [nfsv3]

It's repeated many times.

The workload was an unremarkable software build and regression test
suite on an NFSv3 mount with RDMA.

--
Chuck Lever
* Re: qedr memory leak report
From: Chuck Lever @ 2019-08-30 18:27 UTC
To: Michal Kalderon; +Cc: linux-rdma

> On Aug 30, 2019, at 2:03 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>
> Hi Michal-
>
> In the middle of some other testing, I got this kmemleak report
> while testing with FastLinq cards in iWARP mode:
>
> unreferenced object 0xffff888458923340 (size 32):
>   comm "mount.nfs", pid 2294, jiffies 4298338848 (age 1144.337s)
>   hex dump (first 32 bytes):
>     20 1d 69 63 88 88 ff ff 20 1d 69 63 88 88 ff ff   .ic.... .ic....
>     00 60 7a 69 84 88 ff ff 00 60 82 f9 00 00 00 00  .`zi.....`......
>   backtrace:
>     [<000000000df5bfed>] __kmalloc+0x128/0x176
>     [<0000000020724641>] qedr_alloc_pbl_tbl.constprop.44+0x3c/0x121 [qedr]
>     [<00000000a361c591>] init_mr_info.constprop.41+0xaf/0x21f [qedr]
>     [<00000000e8049714>] qedr_alloc_mr+0x95/0x2c1 [qedr]
>     [<000000000e6102bc>] ib_alloc_mr_user+0x31/0x96 [ib_core]
>     [<00000000d254a9fb>] frwr_init_mr+0x23/0x121 [rpcrdma]
>     [<00000000a0364e35>] rpcrdma_mrs_create+0x45/0xea [rpcrdma]
>     [<00000000fd6bf282>] rpcrdma_buffer_create+0x9e/0x1c9 [rpcrdma]
>     [<00000000be3a1eba>] xprt_setup_rdma+0x109/0x279 [rpcrdma]
>     [<00000000b736b88f>] xprt_create_transport+0x39/0x19a [sunrpc]
>     [<000000001024e4dc>] rpc_create+0x118/0x1ab [sunrpc]
>     [<00000000cca43a49>] nfs_create_rpc_client+0xf8/0x15f [nfs]
>     [<00000000073c962c>] nfs_init_client+0x1a/0x3b [nfs]
>     [<00000000b03964c4>] nfs_init_server+0xc1/0x212 [nfs]
>     [<000000001c71f609>] nfs_create_server+0x74/0x1a4 [nfs]
>     [<000000004dc919a1>] nfs3_create_server+0xb/0x25 [nfsv3]
>
> It's repeated many times.
>
> The workload was an unremarkable software build and regression test
> suite on an NFSv3 mount with RDMA.

Also seeing one of these per NFS mount:

unreferenced object 0xffff888869f39b40 (size 64):
  comm "kworker/u28:0", pid 17569, jiffies 4299267916 (age 1592.907s)
  hex dump (first 32 bytes):
    00 80 53 6d 88 88 ff ff 00 00 00 00 00 00 00 00  ..Sm............
    00 48 e2 66 84 88 ff ff 00 00 00 00 00 00 00 00  .H.f............
  backtrace:
    [<0000000063e652dd>] kmem_cache_alloc_trace+0xed/0x133
    [<0000000083b1e912>] qedr_iw_connect+0xf9/0x3c8 [qedr]
    [<00000000553be951>] iw_cm_connect+0xd0/0x157 [iw_cm]
    [<00000000b086730c>] rdma_connect+0x54e/0x5b0 [rdma_cm]
    [<00000000d8af3cf2>] rpcrdma_ep_connect+0x22b/0x360 [rpcrdma]
    [<000000006a413c8d>] xprt_rdma_connect_worker+0x24/0x88 [rpcrdma]
    [<000000001c5b049a>] process_one_work+0x196/0x2c6
    [<000000007e3403ba>] worker_thread+0x1ad/0x261
    [<000000001daaa973>] kthread+0xf4/0xf9
    [<0000000014987b31>] ret_from_fork+0x24/0x30

Looks like this one is not being freed:

514         ep = kzalloc(sizeof(*ep), GFP_KERNEL);
515         if (!ep)
516                 return -ENOMEM;

--
Chuck Lever
* Re: qedr memory leak report
From: Leon Romanovsky @ 2019-08-31 7:30 UTC
To: Doug Ledford; +Cc: Chuck Lever, Michal Kalderon, linux-rdma

Doug,

I think that it can be counted as a good example of why allowing
memory leaks in drivers (HNS) is not such a great idea.

Thanks

On Fri, Aug 30, 2019 at 02:27:49PM -0400, Chuck Lever wrote:
>
> > On Aug 30, 2019, at 2:03 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > Hi Michal-
> >
> > In the middle of some other testing, I got this kmemleak report
> > while testing with FastLinq cards in iWARP mode:
> >
> > unreferenced object 0xffff888458923340 (size 32):
> >   comm "mount.nfs", pid 2294, jiffies 4298338848 (age 1144.337s)
> >   hex dump (first 32 bytes):
> >     20 1d 69 63 88 88 ff ff 20 1d 69 63 88 88 ff ff   .ic.... .ic....
> >     00 60 7a 69 84 88 ff ff 00 60 82 f9 00 00 00 00  .`zi.....`......
> >   backtrace:
> >     [<000000000df5bfed>] __kmalloc+0x128/0x176
> >     [<0000000020724641>] qedr_alloc_pbl_tbl.constprop.44+0x3c/0x121 [qedr]
> >     [<00000000a361c591>] init_mr_info.constprop.41+0xaf/0x21f [qedr]
> >     [<00000000e8049714>] qedr_alloc_mr+0x95/0x2c1 [qedr]
> >     [<000000000e6102bc>] ib_alloc_mr_user+0x31/0x96 [ib_core]
> >     [<00000000d254a9fb>] frwr_init_mr+0x23/0x121 [rpcrdma]
> >     [<00000000a0364e35>] rpcrdma_mrs_create+0x45/0xea [rpcrdma]
> >     [<00000000fd6bf282>] rpcrdma_buffer_create+0x9e/0x1c9 [rpcrdma]
> >     [<00000000be3a1eba>] xprt_setup_rdma+0x109/0x279 [rpcrdma]
> >     [<00000000b736b88f>] xprt_create_transport+0x39/0x19a [sunrpc]
> >     [<000000001024e4dc>] rpc_create+0x118/0x1ab [sunrpc]
> >     [<00000000cca43a49>] nfs_create_rpc_client+0xf8/0x15f [nfs]
> >     [<00000000073c962c>] nfs_init_client+0x1a/0x3b [nfs]
> >     [<00000000b03964c4>] nfs_init_server+0xc1/0x212 [nfs]
> >     [<000000001c71f609>] nfs_create_server+0x74/0x1a4 [nfs]
> >     [<000000004dc919a1>] nfs3_create_server+0xb/0x25 [nfsv3]
> >
> > It's repeated many times.
> >
> > The workload was an unremarkable software build and regression test
> > suite on an NFSv3 mount with RDMA.
>
> Also seeing one of these per NFS mount:
>
> unreferenced object 0xffff888869f39b40 (size 64):
>   comm "kworker/u28:0", pid 17569, jiffies 4299267916 (age 1592.907s)
>   hex dump (first 32 bytes):
>     00 80 53 6d 88 88 ff ff 00 00 00 00 00 00 00 00  ..Sm............
>     00 48 e2 66 84 88 ff ff 00 00 00 00 00 00 00 00  .H.f............
>   backtrace:
>     [<0000000063e652dd>] kmem_cache_alloc_trace+0xed/0x133
>     [<0000000083b1e912>] qedr_iw_connect+0xf9/0x3c8 [qedr]
>     [<00000000553be951>] iw_cm_connect+0xd0/0x157 [iw_cm]
>     [<00000000b086730c>] rdma_connect+0x54e/0x5b0 [rdma_cm]
>     [<00000000d8af3cf2>] rpcrdma_ep_connect+0x22b/0x360 [rpcrdma]
>     [<000000006a413c8d>] xprt_rdma_connect_worker+0x24/0x88 [rpcrdma]
>     [<000000001c5b049a>] process_one_work+0x196/0x2c6
>     [<000000007e3403ba>] worker_thread+0x1ad/0x261
>     [<000000001daaa973>] kthread+0xf4/0xf9
>     [<0000000014987b31>] ret_from_fork+0x24/0x30
>
> Looks like this one is not being freed:
>
> 514         ep = kzalloc(sizeof(*ep), GFP_KERNEL);
> 515         if (!ep)
> 516                 return -ENOMEM;
>
> --
> Chuck Lever
* Re: qedr memory leak report
From: Doug Ledford @ 2019-08-31 14:33 UTC
To: Leon Romanovsky; +Cc: Chuck Lever, Michal Kalderon, linux-rdma

On Sat, 2019-08-31 at 10:30 +0300, Leon Romanovsky wrote:
> Doug,
>
> I think that it can be counted as a good example of why allowing
> memory leaks in drivers (HNS) is not such a great idea.

Crashing the machine is worse.

> Thanks
>
> On Fri, Aug 30, 2019 at 02:27:49PM -0400, Chuck Lever wrote:
> >
> > > On Aug 30, 2019, at 2:03 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> > >
> > > Hi Michal-
> > >
> > > In the middle of some other testing, I got this kmemleak report
> > > while testing with FastLinq cards in iWARP mode:
> > >
> > > unreferenced object 0xffff888458923340 (size 32):
> > >   comm "mount.nfs", pid 2294, jiffies 4298338848 (age 1144.337s)
> > >   hex dump (first 32 bytes):
> > >     20 1d 69 63 88 88 ff ff 20 1d 69 63 88 88 ff ff   .ic.... .ic....
> > >     00 60 7a 69 84 88 ff ff 00 60 82 f9 00 00 00 00  .`zi.....`......
> > >   backtrace:
> > >     [<000000000df5bfed>] __kmalloc+0x128/0x176
> > >     [<0000000020724641>] qedr_alloc_pbl_tbl.constprop.44+0x3c/0x121 [qedr]
> > >     [<00000000a361c591>] init_mr_info.constprop.41+0xaf/0x21f [qedr]
> > >     [<00000000e8049714>] qedr_alloc_mr+0x95/0x2c1 [qedr]
> > >     [<000000000e6102bc>] ib_alloc_mr_user+0x31/0x96 [ib_core]
> > >     [<00000000d254a9fb>] frwr_init_mr+0x23/0x121 [rpcrdma]
> > >     [<00000000a0364e35>] rpcrdma_mrs_create+0x45/0xea [rpcrdma]
> > >     [<00000000fd6bf282>] rpcrdma_buffer_create+0x9e/0x1c9 [rpcrdma]
> > >     [<00000000be3a1eba>] xprt_setup_rdma+0x109/0x279 [rpcrdma]
> > >     [<00000000b736b88f>] xprt_create_transport+0x39/0x19a [sunrpc]
> > >     [<000000001024e4dc>] rpc_create+0x118/0x1ab [sunrpc]
> > >     [<00000000cca43a49>] nfs_create_rpc_client+0xf8/0x15f [nfs]
> > >     [<00000000073c962c>] nfs_init_client+0x1a/0x3b [nfs]
> > >     [<00000000b03964c4>] nfs_init_server+0xc1/0x212 [nfs]
> > >     [<000000001c71f609>] nfs_create_server+0x74/0x1a4 [nfs]
> > >     [<000000004dc919a1>] nfs3_create_server+0xb/0x25 [nfsv3]
> > >
> > > It's repeated many times.
> > >
> > > The workload was an unremarkable software build and regression
> > > test suite on an NFSv3 mount with RDMA.
> >
> > Also seeing one of these per NFS mount:
> >
> > unreferenced object 0xffff888869f39b40 (size 64):
> >   comm "kworker/u28:0", pid 17569, jiffies 4299267916 (age 1592.907s)
> >   hex dump (first 32 bytes):
> >     00 80 53 6d 88 88 ff ff 00 00 00 00 00 00 00 00  ..Sm............
> >     00 48 e2 66 84 88 ff ff 00 00 00 00 00 00 00 00  .H.f............
> >   backtrace:
> >     [<0000000063e652dd>] kmem_cache_alloc_trace+0xed/0x133
> >     [<0000000083b1e912>] qedr_iw_connect+0xf9/0x3c8 [qedr]
> >     [<00000000553be951>] iw_cm_connect+0xd0/0x157 [iw_cm]
> >     [<00000000b086730c>] rdma_connect+0x54e/0x5b0 [rdma_cm]
> >     [<00000000d8af3cf2>] rpcrdma_ep_connect+0x22b/0x360 [rpcrdma]
> >     [<000000006a413c8d>] xprt_rdma_connect_worker+0x24/0x88 [rpcrdma]
> >     [<000000001c5b049a>] process_one_work+0x196/0x2c6
> >     [<000000007e3403ba>] worker_thread+0x1ad/0x261
> >     [<000000001daaa973>] kthread+0xf4/0xf9
> >     [<0000000014987b31>] ret_from_fork+0x24/0x30
> >
> > Looks like this one is not being freed:
> >
> > 514         ep = kzalloc(sizeof(*ep), GFP_KERNEL);
> > 515         if (!ep)
> > 516                 return -ENOMEM;
> >
> > --
> > Chuck Lever

--
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
* Re: qedr memory leak report
From: Leon Romanovsky @ 2019-08-31 15:19 UTC
To: Doug Ledford; +Cc: Chuck Lever, Michal Kalderon, linux-rdma

On Sat, Aug 31, 2019 at 10:33:13AM -0400, Doug Ledford wrote:
> On Sat, 2019-08-31 at 10:30 +0300, Leon Romanovsky wrote:
> > Doug,
> >
> > I think that it can be counted as a good example of why allowing
> > memory leaks in drivers (HNS) is not such a great idea.
>
> Crashing the machine is worse.

The problem with it is that you are "punishing" the whole subsystem
because of some piece of crap which users can't buy anyway.

If HNS wants to have memory leaks, they need to do it outside of the
upstream kernel.

In general, if users buy shitty hardware, they need to be ready to
have kernel panics too. It works that way with faulty DRAM, where the
kernel doesn't hide such failures, so I don't see any rationale to
invent something special for ib_device.

Thanks
* Re: qedr memory leak report
From: Doug Ledford @ 2019-08-31 17:17 UTC
To: Leon Romanovsky; +Cc: Chuck Lever, Michal Kalderon, linux-rdma

On Sat, 2019-08-31 at 18:19 +0300, Leon Romanovsky wrote:
> On Sat, Aug 31, 2019 at 10:33:13AM -0400, Doug Ledford wrote:
> > On Sat, 2019-08-31 at 10:30 +0300, Leon Romanovsky wrote:
> > > Doug,
> > >
> > > I think that it can be counted as a good example of why allowing
> > > memory leaks in drivers (HNS) is not such a great idea.
> >
> > Crashing the machine is worse.
>
> The problem with it is that you are "punishing" the whole subsystem
> because of some piece of crap which users can't buy anyway.

No, I'm not. The patch in question was in the hns driver and only
leaked resources assigned to the hns card when the hns card timed out
in freeing those resources. That doesn't punish the entire subsystem,
it only punishes the users of that card, and then only if the card has
flaked out.

> If HNS wants to have memory leaks, they need to do it outside of the
> upstream kernel.

Nope.

> In general, if users buy shitty hardware, they need to be ready to
> have kernel panics too. It works that way with faulty DRAM, where the
> kernel doesn't hide such failures, so I don't see any rationale to
> invent something special for ib_device.

What you are advocating for is not "shitty DRAM crashing the machine",
you are advocating for "having ECC DRAM and then intentionally turning
the ECC off and then crashing the machine". Please repeat after me: WE
DON'T CRASH MACHINES. PERIOD. If it is avoidable, we avoid it. That's
why BUG_ONs have to go and why they piss Linus off so much. If you
crash the machine, people are left scratching their heads and asking
why. If you don't crash the machine, they have a chance to debug the
issue and resolve it.

The entire idea that you are advocating for crashing the machine as
being preferable to leaking a few resources is ludicrous. WE DON'T
CRASH MACHINES. PERIOD. Please repeat that until it fully sinks in.

> Thanks

--
Doug Ledford <dledford@redhat.com>
    GPG KeyID: B826A3330E572FDD
    Fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
* Re: qedr memory leak report
From: Leon Romanovsky @ 2019-08-31 18:55 UTC
To: Doug Ledford; +Cc: Chuck Lever, Michal Kalderon, linux-rdma

On Sat, Aug 31, 2019 at 01:17:05PM -0400, Doug Ledford wrote:
> On Sat, 2019-08-31 at 18:19 +0300, Leon Romanovsky wrote:
> > On Sat, Aug 31, 2019 at 10:33:13AM -0400, Doug Ledford wrote:
> > > On Sat, 2019-08-31 at 10:30 +0300, Leon Romanovsky wrote:
> > > > Doug,
> > > >
> > > > I think that it can be counted as a good example of why allowing
> > > > memory leaks in drivers (HNS) is not such a great idea.
> > >
> > > Crashing the machine is worse.
> >
> > The problem with it is that you are "punishing" the whole subsystem
> > because of some piece of crap which users can't buy anyway.
>
> No, I'm not. The patch in question was in the hns driver and only
> leaked resources assigned to the hns card when the hns card timed out
> in freeing those resources. That doesn't punish the entire subsystem,
> it only punishes the users of that card, and then only if the card has
> flaked out.

Unfortunately, you are. Our model is based on the fact that destroy
operations can't fail, and that all allocations performed by IB/core
should be released right after the call to the relevant destroy
callback. The fact that you are allowing one driver not to succeed in
destroy means that you will need to allow everyone the chance to
return errors and skip freeing resources.

> > If HNS wants to have memory leaks, they need to do it outside of the
> > upstream kernel.
>
> Nope.
>
> > In general, if users buy shitty hardware, they need to be ready to
> > have kernel panics too. It works that way with faulty DRAM, where
> > the kernel doesn't hide such failures, so I don't see any rationale
> > to invent something special for ib_device.
>
> What you are advocating for is not "shitty DRAM crashing the machine",
> you are advocating for "having ECC DRAM and then intentionally turning
> the ECC off and then crashing the machine". Please repeat after me: WE
> DON'T CRASH MACHINES. PERIOD. If it is avoidable, we avoid it. That's
> why BUG_ONs have to go and why they piss Linus off so much. If you
> crash the machine, people are left scratching their heads and asking
> why. If you don't crash the machine, they have a chance to debug the
> issue and resolve it.
>
> The entire idea that you are advocating for crashing the machine as
> being preferable to leaking a few resources is ludicrous. WE DON'T
> CRASH MACHINES. PERIOD. Please repeat that until it fully sinks in.

I'm not advocating for that, and I don't buy the explanation that
freeing memory will cause the machine to crash; in the end, freed
memory means that the user won't have access to such a bad resource.

Thanks

> > Thanks
>
> --
> Doug Ledford <dledford@redhat.com>
> GPG KeyID: B826A3330E572FDD
> Fingerprint = AE6B 1BDA 122B 23B4 265B 1274 B826 A333 0E57 2FDD
* RE: [EXT] Re: qedr memory leak report
From: Michal Kalderon @ 2019-09-02 7:53 UTC
To: Chuck Lever, Michal Kalderon; +Cc: linux-rdma

> From: Chuck Lever <chuck.lever@oracle.com>
> Sent: Friday, August 30, 2019 9:28 PM
>
> External Email
>
> ----------------------------------------------------------------------
>
> > On Aug 30, 2019, at 2:03 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
> >
> > Hi Michal-
> >
> > In the middle of some other testing, I got this kmemleak report
> > while testing with FastLinq cards in iWARP mode:
> >
> > unreferenced object 0xffff888458923340 (size 32):
> >   comm "mount.nfs", pid 2294, jiffies 4298338848 (age 1144.337s)
> >   hex dump (first 32 bytes):
> >     20 1d 69 63 88 88 ff ff 20 1d 69 63 88 88 ff ff   .ic.... .ic....
> >     00 60 7a 69 84 88 ff ff 00 60 82 f9 00 00 00 00  .`zi.....`......
> >   backtrace:
> >     [<000000000df5bfed>] __kmalloc+0x128/0x176
> >     [<0000000020724641>] qedr_alloc_pbl_tbl.constprop.44+0x3c/0x121 [qedr]
> >     [<00000000a361c591>] init_mr_info.constprop.41+0xaf/0x21f [qedr]
> >     [<00000000e8049714>] qedr_alloc_mr+0x95/0x2c1 [qedr]
> >     [<000000000e6102bc>] ib_alloc_mr_user+0x31/0x96 [ib_core]
> >     [<00000000d254a9fb>] frwr_init_mr+0x23/0x121 [rpcrdma]
> >     [<00000000a0364e35>] rpcrdma_mrs_create+0x45/0xea [rpcrdma]
> >     [<00000000fd6bf282>] rpcrdma_buffer_create+0x9e/0x1c9 [rpcrdma]
> >     [<00000000be3a1eba>] xprt_setup_rdma+0x109/0x279 [rpcrdma]
> >     [<00000000b736b88f>] xprt_create_transport+0x39/0x19a [sunrpc]
> >     [<000000001024e4dc>] rpc_create+0x118/0x1ab [sunrpc]
> >     [<00000000cca43a49>] nfs_create_rpc_client+0xf8/0x15f [nfs]
> >     [<00000000073c962c>] nfs_init_client+0x1a/0x3b [nfs]
> >     [<00000000b03964c4>] nfs_init_server+0xc1/0x212 [nfs]
> >     [<000000001c71f609>] nfs_create_server+0x74/0x1a4 [nfs]
> >     [<000000004dc919a1>] nfs3_create_server+0xb/0x25 [nfsv3]
> >
> > It's repeated many times.
> >
> > The workload was an unremarkable software build and regression test
> > suite on an NFSv3 mount with RDMA.
>
> Also seeing one of these per NFS mount:
>
> unreferenced object 0xffff888869f39b40 (size 64):
>   comm "kworker/u28:0", pid 17569, jiffies 4299267916 (age 1592.907s)
>   hex dump (first 32 bytes):
>     00 80 53 6d 88 88 ff ff 00 00 00 00 00 00 00 00  ..Sm............
>     00 48 e2 66 84 88 ff ff 00 00 00 00 00 00 00 00  .H.f............
>   backtrace:
>     [<0000000063e652dd>] kmem_cache_alloc_trace+0xed/0x133
>     [<0000000083b1e912>] qedr_iw_connect+0xf9/0x3c8 [qedr]
>     [<00000000553be951>] iw_cm_connect+0xd0/0x157 [iw_cm]
>     [<00000000b086730c>] rdma_connect+0x54e/0x5b0 [rdma_cm]
>     [<00000000d8af3cf2>] rpcrdma_ep_connect+0x22b/0x360 [rpcrdma]
>     [<000000006a413c8d>] xprt_rdma_connect_worker+0x24/0x88 [rpcrdma]
>     [<000000001c5b049a>] process_one_work+0x196/0x2c6
>     [<000000007e3403ba>] worker_thread+0x1ad/0x261
>     [<000000001daaa973>] kthread+0xf4/0xf9
>     [<0000000014987b31>] ret_from_fork+0x24/0x30
>
> Looks like this one is not being freed:
>
> 514         ep = kzalloc(sizeof(*ep), GFP_KERNEL);
> 515         if (!ep)
> 516                 return -ENOMEM;
>

Thanks Chuck! I'll take care of this. Is there an easy repro for
getting the leak?

Thanks,
Michal
* Re: [EXT] Re: qedr memory leak report
From: Chuck Lever @ 2019-09-03 12:53 UTC
To: Michal Kalderon; +Cc: Michal Kalderon, linux-rdma

On Sep 2, 2019, at 3:53 AM, Michal Kalderon <mkalderon@marvell.com> wrote:
>> From: Chuck Lever <chuck.lever@oracle.com>
>> Sent: Friday, August 30, 2019 9:28 PM
>>
>> External Email
>>
>> ----------------------------------------------------------------------
>>
>>> On Aug 30, 2019, at 2:03 PM, Chuck Lever <chuck.lever@oracle.com> wrote:
>>>
>>> Hi Michal-
>>>
>>> In the middle of some other testing, I got this kmemleak report
>>> while testing with FastLinq cards in iWARP mode:
>>>
>>> unreferenced object 0xffff888458923340 (size 32):
>>>   comm "mount.nfs", pid 2294, jiffies 4298338848 (age 1144.337s)
>>>   hex dump (first 32 bytes):
>>>     20 1d 69 63 88 88 ff ff 20 1d 69 63 88 88 ff ff   .ic.... .ic....
>>>     00 60 7a 69 84 88 ff ff 00 60 82 f9 00 00 00 00  .`zi.....`......
>>>   backtrace:
>>>     [<000000000df5bfed>] __kmalloc+0x128/0x176
>>>     [<0000000020724641>] qedr_alloc_pbl_tbl.constprop.44+0x3c/0x121 [qedr]
>>>     [<00000000a361c591>] init_mr_info.constprop.41+0xaf/0x21f [qedr]
>>>     [<00000000e8049714>] qedr_alloc_mr+0x95/0x2c1 [qedr]
>>>     [<000000000e6102bc>] ib_alloc_mr_user+0x31/0x96 [ib_core]
>>>     [<00000000d254a9fb>] frwr_init_mr+0x23/0x121 [rpcrdma]
>>>     [<00000000a0364e35>] rpcrdma_mrs_create+0x45/0xea [rpcrdma]
>>>     [<00000000fd6bf282>] rpcrdma_buffer_create+0x9e/0x1c9 [rpcrdma]
>>>     [<00000000be3a1eba>] xprt_setup_rdma+0x109/0x279 [rpcrdma]
>>>     [<00000000b736b88f>] xprt_create_transport+0x39/0x19a [sunrpc]
>>>     [<000000001024e4dc>] rpc_create+0x118/0x1ab [sunrpc]
>>>     [<00000000cca43a49>] nfs_create_rpc_client+0xf8/0x15f [nfs]
>>>     [<00000000073c962c>] nfs_init_client+0x1a/0x3b [nfs]
>>>     [<00000000b03964c4>] nfs_init_server+0xc1/0x212 [nfs]
>>>     [<000000001c71f609>] nfs_create_server+0x74/0x1a4 [nfs]
>>>     [<000000004dc919a1>] nfs3_create_server+0xb/0x25 [nfsv3]
>>>
>>> It's repeated many times.
>>>
>>> The workload was an unremarkable software build and regression test
>>> suite on an NFSv3 mount with RDMA.
>>
>> Also seeing one of these per NFS mount:
>>
>> unreferenced object 0xffff888869f39b40 (size 64):
>>   comm "kworker/u28:0", pid 17569, jiffies 4299267916 (age 1592.907s)
>>   hex dump (first 32 bytes):
>>     00 80 53 6d 88 88 ff ff 00 00 00 00 00 00 00 00  ..Sm............
>>     00 48 e2 66 84 88 ff ff 00 00 00 00 00 00 00 00  .H.f............
>>   backtrace:
>>     [<0000000063e652dd>] kmem_cache_alloc_trace+0xed/0x133
>>     [<0000000083b1e912>] qedr_iw_connect+0xf9/0x3c8 [qedr]
>>     [<00000000553be951>] iw_cm_connect+0xd0/0x157 [iw_cm]
>>     [<00000000b086730c>] rdma_connect+0x54e/0x5b0 [rdma_cm]
>>     [<00000000d8af3cf2>] rpcrdma_ep_connect+0x22b/0x360 [rpcrdma]
>>     [<000000006a413c8d>] xprt_rdma_connect_worker+0x24/0x88 [rpcrdma]
>>     [<000000001c5b049a>] process_one_work+0x196/0x2c6
>>     [<000000007e3403ba>] worker_thread+0x1ad/0x261
>>     [<000000001daaa973>] kthread+0xf4/0xf9
>>     [<0000000014987b31>] ret_from_fork+0x24/0x30
>>
>> Looks like this one is not being freed:
>>
>> 514         ep = kzalloc(sizeof(*ep), GFP_KERNEL);
>> 515         if (!ep)
>> 516                 return -ENOMEM;
>>
> Thanks Chuck! I'll take care of this. Is there an easy repro for
> getting the leak?

Nothing special is necessary. Enable kmemleak detection, then run any
NFS/RDMA workload that does some I/O, unmount, and wait a few minutes
for the kmemleak laundromat thread to run.
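[Editorial note: the repro procedure described above, sketched as a script. The kmemleak debugfs paths are the standard interface; the server name, export, mount point, and workload below are placeholders, not from the report.]

```shell
#!/bin/sh
# Repro sketch for the qedr leak reports.  Requires a kernel built with
# CONFIG_DEBUG_KMEMLEAK=y (and booted with kmemleak=on if it defaults off).
KMEMLEAK=/sys/kernel/debug/kmemleak

if [ -w "$KMEMLEAK" ]; then
	# Placeholder server/export/mount point -- substitute your own.
	mount -t nfs -o rdma,port=20049,vers=3 server:/export /mnt
	dd if=/dev/zero of=/mnt/scratch bs=1M count=100   # any I/O workload works
	umount /mnt
	echo scan > "$KMEMLEAK"   # trigger a scan instead of waiting for the periodic one
	cat "$KMEMLEAK"           # prints "unreferenced object ..." reports, if any
else
	echo "kmemleak interface not present; enable CONFIG_DEBUG_KMEMLEAK"
fi
```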