linux-rdma.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* isert patch leaving resources behind
@ 2023-08-10 14:44 Dennis Dalessandro
  2023-08-13  8:29 ` Leon Romanovsky
  2023-08-13 14:18 ` Sagi Grimberg
  0 siblings, 2 replies; 11+ messages in thread
From: Dennis Dalessandro @ 2023-08-10 14:44 UTC (permalink / raw)
  To: OFED mailing list
  Cc: Ehab Ababneh, saravanan.vajravel, Selvin Xavier, Sagi Grimberg

Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert connection") is
causing problems on OPA when we try to unload the driver after doing iSCI
testing. Reverting this commit causes the problem to go away. Any ideas? Was
testing done on this patch with removing/hotplugging drivers?

[29151.413816] ------------[ cut here ]------------
[29151.419086] WARNING: CPU: 52 PID: 2117247 at drivers/infiniband/core/cq.c:359
ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
[29151.431096] Modules linked in: nfsd nfs_acl target_core_user uio tcm_fc libfc
scsi_transport_fc tcm_loop target_core_pscsi target_core_iblock target_core_file
rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs
rfkill rpcrdma rdma_ucm ib_srpt sunrpc ib_isert iscsi_target_mod target_core_mod
opa_vnic ib_iser libiscsi ib_umad scsi_transport_iscsi rdma_cm ib_ipoib iw_cm
ib_cm hfi1(-) rdmavt ib_uverbs intel_rapl_msr intel_rapl_common sb_edac ib_core
x86_pkg_temp_thermal intel_powerclamp coretemp i2c_i801 mxm_wmi rapl iTCO_wdt
ipmi_si iTCO_vendor_support mei_me ipmi_devintf mei intel_cstate ioatdma
intel_uncore i2c_smbus joydev pcspkr lpc_ich ipmi_msghandler acpi_power_meter
acpi_pad xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg crct10dif_pclmul
crc32_pclmul crc32c_intel drm_kms_helper drm_shmem_helper ahci libahci
ghash_clmulni_intel igb drm libata dca i2c_algo_bit wmi fuse
[29151.520056] CPU: 52 PID: 2117247 Comm: modprobe Not tainted 6.5.0-rc1+ #1
[29151.527759] Hardware name: Intel Corporation S2600CWR/S2600CW, BIOS
SE5C610.86B.01.01.0014.121820151719 12/18/2015
[29151.539462] RIP: 0010:ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
[29151.545908] Code: ff 48 8b 43 40 48 8d 7b 40 48 83 e8 40 4c 39 e7 75 b3 49 83
c4 10 4d 39 fc 75 94 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc <0f> 0b eb a1
90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f
[29151.567086] RSP: 0018:ffffc10bea13fc80 EFLAGS: 00010206
[29151.573040] RAX: 000000000000010c RBX: ffff9bf5c7e66c00 RCX: 000000008020001d
[29151.581120] RDX: 000000008020001e RSI: fffff175221f9900 RDI: ffff9bf5c7e67640
[29151.589202] RBP: ffff9bf5c7e67600 R08: ffff9bf5c7e64400 R09: 000000008020001d
[29151.597280] R10: 0000000040000000 R11: 0000000000000000 R12: ffff9bee4b1e8a18
[29151.605360] R13: dead000000000122 R14: dead000000000100 R15: ffff9bee4b1e8a38
[29151.613437] FS:  00007ff1e6d38740(0000) GS:ffff9bfd9fb00000(0000)
knlGS:0000000000000000
[29151.622610] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[29151.629133] CR2: 00005652044ecc68 CR3: 0000000889b5c005 CR4: 00000000001706e0
[29151.637212] Call Trace:
[29151.640063]  <TASK>
[29151.642500]  ? __warn+0x80/0x130
[29151.646209]  ? ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
[29151.651997]  ? report_bug+0x195/0x1a0
[29151.656191]  ? handle_bug+0x3c/0x70
[29151.660190]  ? exc_invalid_op+0x14/0x70
[29151.664574]  ? asm_exc_invalid_op+0x16/0x20
[29151.669352]  ? ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
[29151.675107]  disable_device+0x9d/0x160 [ib_core]
[29151.680399]  __ib_unregister_device+0x42/0xb0 [ib_core]
[29151.686361]  ib_unregister_device+0x22/0x30 [ib_core]
[29151.692128]  rvt_unregister_device+0x20/0x90 [rdmavt]
[29151.697889]  hfi1_unregister_ib_device+0x16/0xf0 [hfi1]
[29151.703936]  remove_one+0x55/0x1a0 [hfi1]
[29151.708588]  pci_device_remove+0x36/0xa0
[29151.713076]  device_release_driver_internal+0x193/0x200
[29151.719035]  driver_detach+0x44/0x90
[29151.723137]  bus_remove_driver+0x69/0xf0
[29151.727619]  pci_unregister_driver+0x2a/0xb0
[29151.732490]  hfi1_mod_cleanup+0xc/0x3c [hfi1]
[29151.737516]  __do_sys_delete_module.constprop.0+0x17a/0x2f0
[29151.743844]  ? exit_to_user_mode_prepare+0xc4/0xd0
[29151.749298]  ? syscall_trace_enter.constprop.0+0x126/0x1a0
[29151.755527]  do_syscall_64+0x5c/0x90
[29151.759631]  ? syscall_exit_to_user_mode+0x12/0x30
[29151.765089]  ? do_syscall_64+0x69/0x90
[29151.769374]  ? syscall_exit_work+0x103/0x130
[29151.774243]  ? syscall_exit_to_user_mode+0x12/0x30
[29151.779716]  ? do_syscall_64+0x69/0x90
[29151.784020]  ? exc_page_fault+0x65/0x150
[29151.788499]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
[29151.794245] RIP: 0033:0x7ff1e643f5ab
[29151.798336] Code: 73 01 c3 48 8b 0d 75 a8 1b 00 f7 d8 64 89 01 48 83 c8 ff c3
66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0
ff ff 73 01 c3 48 8b 0d 45 a8 1b 00 f7 d8 64 89 01 48
[29151.819504] RSP: 002b:00007ffec9103cc8 EFLAGS: 00000206 ORIG_RAX:
00000000000000b0
[29151.828109] RAX: ffffffffffffffda RBX: 00005615267fdc50 RCX: 00007ff1e643f5ab
[29151.837575] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00005615267fdcb8
[29151.845654] RBP: 00005615267fdc50 R08: 0000000000000000 R09: 0000000000000000
[29151.853739] R10: 00007ff1e659eac0 R11: 0000000000000206 R12: 00005615267fdcb8
[29151.861817] R13: 0000000000000000 R14: 00005615267fdcb8 R15: 00007ffec9105ff8
[29151.869896]  </TASK>
[29151.872423] ---[ end trace 0000000000000000 ]---

And...

[29158.533739] restrack: ------------[ cut here ]------------
[29158.540002] infiniband hfi1_0: BUG: RESTRACK detected leak of resources
[29158.547499] restrack: Kernel PD object allocated by ib_isert is not freed
[29158.555193] restrack: Kernel CQ object allocated by ib_core is not freed
[29158.562801] restrack: Kernel QP object allocated by rdma_cm is not freed
[29158.570395] restrack: ------------[ cut here ]------------


-Denny

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: isert patch leaving resources behind
  2023-08-10 14:44 isert patch leaving resources behind Dennis Dalessandro
@ 2023-08-13  8:29 ` Leon Romanovsky
  2023-08-20  9:46   ` Leon Romanovsky
  2023-08-13 14:18 ` Sagi Grimberg
  1 sibling, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2023-08-13  8:29 UTC (permalink / raw)
  To: Dennis Dalessandro, saravanan.vajravel
  Cc: OFED mailing list, Ehab Ababneh, Selvin Xavier, Sagi Grimberg

On Thu, Aug 10, 2023 at 10:44:00AM -0400, Dennis Dalessandro wrote:
> Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert connection") is
> causing problems on OPA when we try to unload the driver after doing iSCI
> testing. Reverting this commit causes the problem to go away. Any ideas?


Saravanan, can you please post kernel logs as you wrote "When a bunch of iSER target
is cleared, this issue can lead to use-after-free memory issue as isert conn is twice
released" in the reverted commit?

Thanks

> Was testing done on this patch with removing/hotplugging drivers?
> 
> [29151.413816] ------------[ cut here ]------------
> [29151.419086] WARNING: CPU: 52 PID: 2117247 at drivers/infiniband/core/cq.c:359
> ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
> [29151.431096] Modules linked in: nfsd nfs_acl target_core_user uio tcm_fc libfc
> scsi_transport_fc tcm_loop target_core_pscsi target_core_iblock target_core_file
> rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache netfs
> rfkill rpcrdma rdma_ucm ib_srpt sunrpc ib_isert iscsi_target_mod target_core_mod
> opa_vnic ib_iser libiscsi ib_umad scsi_transport_iscsi rdma_cm ib_ipoib iw_cm
> ib_cm hfi1(-) rdmavt ib_uverbs intel_rapl_msr intel_rapl_common sb_edac ib_core
> x86_pkg_temp_thermal intel_powerclamp coretemp i2c_i801 mxm_wmi rapl iTCO_wdt
> ipmi_si iTCO_vendor_support mei_me ipmi_devintf mei intel_cstate ioatdma
> intel_uncore i2c_smbus joydev pcspkr lpc_ich ipmi_msghandler acpi_power_meter
> acpi_pad xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg crct10dif_pclmul
> crc32_pclmul crc32c_intel drm_kms_helper drm_shmem_helper ahci libahci
> ghash_clmulni_intel igb drm libata dca i2c_algo_bit wmi fuse
> [29151.520056] CPU: 52 PID: 2117247 Comm: modprobe Not tainted 6.5.0-rc1+ #1
> [29151.527759] Hardware name: Intel Corporation S2600CWR/S2600CW, BIOS
> SE5C610.86B.01.01.0014.121820151719 12/18/2015
> [29151.539462] RIP: 0010:ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
> [29151.545908] Code: ff 48 8b 43 40 48 8d 7b 40 48 83 e8 40 4c 39 e7 75 b3 49 83
> c4 10 4d 39 fc 75 94 5b 5d 41 5c 41 5d 41 5e 41 5f c3 cc cc cc cc <0f> 0b eb a1
> 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 0f 1f
> [29151.567086] RSP: 0018:ffffc10bea13fc80 EFLAGS: 00010206
> [29151.573040] RAX: 000000000000010c RBX: ffff9bf5c7e66c00 RCX: 000000008020001d
> [29151.581120] RDX: 000000008020001e RSI: fffff175221f9900 RDI: ffff9bf5c7e67640
> [29151.589202] RBP: ffff9bf5c7e67600 R08: ffff9bf5c7e64400 R09: 000000008020001d
> [29151.597280] R10: 0000000040000000 R11: 0000000000000000 R12: ffff9bee4b1e8a18
> [29151.605360] R13: dead000000000122 R14: dead000000000100 R15: ffff9bee4b1e8a38
> [29151.613437] FS:  00007ff1e6d38740(0000) GS:ffff9bfd9fb00000(0000)
> knlGS:0000000000000000
> [29151.622610] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [29151.629133] CR2: 00005652044ecc68 CR3: 0000000889b5c005 CR4: 00000000001706e0
> [29151.637212] Call Trace:
> [29151.640063]  <TASK>
> [29151.642500]  ? __warn+0x80/0x130
> [29151.646209]  ? ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
> [29151.651997]  ? report_bug+0x195/0x1a0
> [29151.656191]  ? handle_bug+0x3c/0x70
> [29151.660190]  ? exc_invalid_op+0x14/0x70
> [29151.664574]  ? asm_exc_invalid_op+0x16/0x20
> [29151.669352]  ? ib_cq_pool_cleanup+0xac/0xb0 [ib_core]
> [29151.675107]  disable_device+0x9d/0x160 [ib_core]
> [29151.680399]  __ib_unregister_device+0x42/0xb0 [ib_core]
> [29151.686361]  ib_unregister_device+0x22/0x30 [ib_core]
> [29151.692128]  rvt_unregister_device+0x20/0x90 [rdmavt]
> [29151.697889]  hfi1_unregister_ib_device+0x16/0xf0 [hfi1]
> [29151.703936]  remove_one+0x55/0x1a0 [hfi1]
> [29151.708588]  pci_device_remove+0x36/0xa0
> [29151.713076]  device_release_driver_internal+0x193/0x200
> [29151.719035]  driver_detach+0x44/0x90
> [29151.723137]  bus_remove_driver+0x69/0xf0
> [29151.727619]  pci_unregister_driver+0x2a/0xb0
> [29151.732490]  hfi1_mod_cleanup+0xc/0x3c [hfi1]
> [29151.737516]  __do_sys_delete_module.constprop.0+0x17a/0x2f0
> [29151.743844]  ? exit_to_user_mode_prepare+0xc4/0xd0
> [29151.749298]  ? syscall_trace_enter.constprop.0+0x126/0x1a0
> [29151.755527]  do_syscall_64+0x5c/0x90
> [29151.759631]  ? syscall_exit_to_user_mode+0x12/0x30
> [29151.765089]  ? do_syscall_64+0x69/0x90
> [29151.769374]  ? syscall_exit_work+0x103/0x130
> [29151.774243]  ? syscall_exit_to_user_mode+0x12/0x30
> [29151.779716]  ? do_syscall_64+0x69/0x90
> [29151.784020]  ? exc_page_fault+0x65/0x150
> [29151.788499]  entry_SYSCALL_64_after_hwframe+0x6e/0xd8
> [29151.794245] RIP: 0033:0x7ff1e643f5ab
> [29151.798336] Code: 73 01 c3 48 8b 0d 75 a8 1b 00 f7 d8 64 89 01 48 83 c8 ff c3
> 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0
> ff ff 73 01 c3 48 8b 0d 45 a8 1b 00 f7 d8 64 89 01 48
> [29151.819504] RSP: 002b:00007ffec9103cc8 EFLAGS: 00000206 ORIG_RAX:
> 00000000000000b0
> [29151.828109] RAX: ffffffffffffffda RBX: 00005615267fdc50 RCX: 00007ff1e643f5ab
> [29151.837575] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 00005615267fdcb8
> [29151.845654] RBP: 00005615267fdc50 R08: 0000000000000000 R09: 0000000000000000
> [29151.853739] R10: 00007ff1e659eac0 R11: 0000000000000206 R12: 00005615267fdcb8
> [29151.861817] R13: 0000000000000000 R14: 00005615267fdcb8 R15: 00007ffec9105ff8
> [29151.869896]  </TASK>
> [29151.872423] ---[ end trace 0000000000000000 ]---
> 
> And...
> 
> [29158.533739] restrack: ------------[ cut here ]------------
> [29158.540002] infiniband hfi1_0: BUG: RESTRACK detected leak of resources
> [29158.547499] restrack: Kernel PD object allocated by ib_isert is not freed
> [29158.555193] restrack: Kernel CQ object allocated by ib_core is not freed
> [29158.562801] restrack: Kernel QP object allocated by rdma_cm is not freed
> [29158.570395] restrack: ------------[ cut here ]------------
> 
> 
> -Denny

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: isert patch leaving resources behind
  2023-08-10 14:44 isert patch leaving resources behind Dennis Dalessandro
  2023-08-13  8:29 ` Leon Romanovsky
@ 2023-08-13 14:18 ` Sagi Grimberg
  2023-08-14 11:20   ` Saravanan Vajravel
  1 sibling, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2023-08-13 14:18 UTC (permalink / raw)
  To: Dennis Dalessandro, OFED mailing list
  Cc: Ehab Ababneh, saravanan.vajravel, Selvin Xavier



On 8/10/23 17:44, Dennis Dalessandro wrote:
> Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert connection") is
> causing problems on OPA when we try to unload the driver after doing iSCI
> testing. Reverting this commit causes the problem to go away. Any ideas? Was
> testing done on this patch with removing/hotplugging drivers?

You are correct, the patch is wrong because it doesn't fully release the
connection in ib device surprise removals.

Perhaps this should address this issue?
--
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c 
b/drivers/infiniband/ulp/isert/ib_isert.c
index 92e1e7587af8..274ac9361fe7 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -2570,6 +2570,9 @@ static void isert_wait_conn(struct iscsit_conn *conn)
         isert_put_unsol_pending_cmds(conn);
         isert_wait4cmds(conn);
         isert_wait4logout(isert_conn);
+
+       isert_put_conn(isert_conn);
+       conn->context = NULL;
  }

  static void isert_free_conn(struct iscsit_conn *conn)
--

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* RE: isert patch leaving resources behind
  2023-08-13 14:18 ` Sagi Grimberg
@ 2023-08-14 11:20   ` Saravanan Vajravel
  2023-08-14 12:36     ` Sagi Grimberg
  0 siblings, 1 reply; 11+ messages in thread
From: Saravanan Vajravel @ 2023-08-14 11:20 UTC (permalink / raw)
  To: Sagi Grimberg, Dennis Dalessandro, OFED mailing list
  Cc: Ehab Ababneh, Selvin Xavier, leon

[-- Attachment #1: Type: text/plain, Size: 2781 bytes --]

Hi Sagi,

In surprise removal case as well, isert_free_conn() should get invoked by
iSCSI target module. Right? I am trying to understand how the kernel object
leak is possible.  In your proposed patch, isert_conn is released in
isert_wait_conn() handler. Again, isert module tries to release the
connection in isert_free_conn() handler as well. Hence it will lead
use-after-free issue.

Following are the snippet of iSCSI functions where iscsit_wait_conn()  and
iscsit_free_conn() handlers are invoked in files iscsi_target_login.c &
iscsi_target.c. We need to review if there is a possibility that
iscsit_free_conn() is not invoked in any case. If yes, we may have to fix
that.

void  iscsi_target_login_sess_out
{
.
.
.
	if (conn->conn_transport->iscsit_wait_conn)
		conn->conn_transport->iscsit_wait_conn(conn);

	if (conn->conn_transport->iscsit_free_conn)
		conn->conn_transport->iscsit_free_conn(conn);
.
}

int iscsit_close_connection(
	struct iscsit_conn *conn)
{
.
.
.
	if (conn->conn_transport->iscsit_wait_conn)
		conn->conn_transport->iscsit_wait_conn(conn);
.
.
.
	if (conn->conn_transport->iscsit_free_conn)
		conn->conn_transport->iscsit_free_conn(conn);
.
.
}

@Leon,

I don't have the kernel logs in hand now. We can reproduce the issue and
share.

Thanks & Regards,
Saravanan Vajravel
+91-80-46116256

-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me>
Sent: Sunday, August 13, 2023 7:49 PM
To: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>; OFED
mailing list <linux-rdma@vger.kernel.org>
Cc: Ehab Ababneh <ehab.ababneh@cornelisnetworks.com>;
saravanan.vajravel@broadcom.com; Selvin Xavier <selvin.xavier@broadcom.com>
Subject: Re: isert patch leaving resources behind



On 8/10/23 17:44, Dennis Dalessandro wrote:
> Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert
> connection") is causing problems on OPA when we try to unload the
> driver after doing iSCI testing. Reverting this commit causes the
> problem to go away. Any ideas? Was testing done on this patch with
> removing/hotplugging drivers?

You are correct, the patch is wrong because it doesn't fully release the
connection in ib device surprise removals.

Perhaps this should address this issue?
--
diff --git a/drivers/infiniband/ulp/isert/ib_isert.c
b/drivers/infiniband/ulp/isert/ib_isert.c
index 92e1e7587af8..274ac9361fe7 100644
--- a/drivers/infiniband/ulp/isert/ib_isert.c
+++ b/drivers/infiniband/ulp/isert/ib_isert.c
@@ -2570,6 +2570,9 @@ static void isert_wait_conn(struct iscsit_conn *conn)
         isert_put_unsol_pending_cmds(conn);
         isert_wait4cmds(conn);
         isert_wait4logout(isert_conn);
+
+       isert_put_conn(isert_conn);
+       conn->context = NULL;
  }

  static void isert_free_conn(struct iscsit_conn *conn)
--

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4227 bytes --]

^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: isert patch leaving resources behind
  2023-08-14 11:20   ` Saravanan Vajravel
@ 2023-08-14 12:36     ` Sagi Grimberg
  2023-08-16 11:33       ` Saravanan Vajravel
  0 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2023-08-14 12:36 UTC (permalink / raw)
  To: Saravanan Vajravel, Dennis Dalessandro, OFED mailing list
  Cc: Ehab Ababneh, Selvin Xavier, leon



On 8/14/23 14:20, Saravanan Vajravel wrote:
> Hi Sagi,
> 
> In surprise removal case as well, isert_free_conn() should get invoked by
> iSCSI target module. Right? I am trying to understand how the kernel object
> leak is possible.  In your proposed patch, isert_conn is released in
> isert_wait_conn() handler. Again, isert module tries to release the
> connection in isert_free_conn() handler as well. Hence it will lead
> use-after-free issue.
> 
> Following are the snippet of iSCSI functions where iscsit_wait_conn()  and
> iscsit_free_conn() handlers are invoked in files iscsi_target_login.c &
> iscsi_target.c. We need to review if there is a possibility that
> iscsit_free_conn() is not invoked in any case. If yes, we may have to fix
> that.
> 
> void  iscsi_target_login_sess_out
> {
> .
> .
> .
> 	if (conn->conn_transport->iscsit_wait_conn)
> 		conn->conn_transport->iscsit_wait_conn(conn);
> 
> 	if (conn->conn_transport->iscsit_free_conn)
> 		conn->conn_transport->iscsit_free_conn(conn);
> .
> }
> 
> int iscsit_close_connection(
> 	struct iscsit_conn *conn)
> {
> .
> .
> .
> 	if (conn->conn_transport->iscsit_wait_conn)
> 		conn->conn_transport->iscsit_wait_conn(conn);
> .
> .
> .
> 	if (conn->conn_transport->iscsit_free_conn)
> 		conn->conn_transport->iscsit_free_conn(conn);
> .
> .
> }
> 
> @Leon,
> 
> I don't have the kernel logs in hand now. We can reproduce the issue and
> share.

You should look at the DEVICE_REMOVAL cm handler, it should call
iscsit_cause_connection_reinstatement. I do think that it needs to
pass it sleep=true (for device removal) so that it will wait for
conn_wait_comp...

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: isert patch leaving resources behind
  2023-08-14 12:36     ` Sagi Grimberg
@ 2023-08-16 11:33       ` Saravanan Vajravel
  0 siblings, 0 replies; 11+ messages in thread
From: Saravanan Vajravel @ 2023-08-16 11:33 UTC (permalink / raw)
  To: Sagi Grimberg, Dennis Dalessandro, OFED mailing list
  Cc: Ehab Ababneh, Selvin Xavier, leon

[-- Attachment #1: Type: text/plain, Size: 2406 bytes --]

>> You should look at the DEVICE_REMOVAL cm handler, it should call
>> iscsit_cause_connection_reinstatement. I do think that it needs to pass
>> it sleep=true (for device removal) so that it will wait for
>> conn_wait_comp...
Ok. So it should take care of current resource leak issue. Right?

Thanks & Regards,
Saravanan Vajravel
+91-80-46116256

-----Original Message-----
From: Sagi Grimberg <sagi@grimberg.me>
Sent: Monday, August 14, 2023 6:07 PM
To: Saravanan Vajravel <saravanan.vajravel@broadcom.com>; Dennis Dalessandro
<dennis.dalessandro@cornelisnetworks.com>; OFED mailing list
<linux-rdma@vger.kernel.org>
Cc: Ehab Ababneh <ehab.ababneh@cornelisnetworks.com>; Selvin Xavier
<selvin.xavier@broadcom.com>; leon@kernel.org
Subject: Re: isert patch leaving resources behind



On 8/14/23 14:20, Saravanan Vajravel wrote:
> Hi Sagi,
>
> In surprise removal case as well, isert_free_conn() should get invoked
> by iSCSI target module. Right? I am trying to understand how the
> kernel object leak is possible.  In your proposed patch, isert_conn is
> released in
> isert_wait_conn() handler. Again, isert module tries to release the
> connection in isert_free_conn() handler as well. Hence it will lead
> use-after-free issue.
>
> Following are the snippet of iSCSI functions where iscsit_wait_conn()
> and
> iscsit_free_conn() handlers are invoked in files iscsi_target_login.c
> & iscsi_target.c. We need to review if there is a possibility that
> iscsit_free_conn() is not invoked in any case. If yes, we may have to
> fix that.
>
> void  iscsi_target_login_sess_out
> {
> .
> .
> .
> 	if (conn->conn_transport->iscsit_wait_conn)
> 		conn->conn_transport->iscsit_wait_conn(conn);
>
> 	if (conn->conn_transport->iscsit_free_conn)
> 		conn->conn_transport->iscsit_free_conn(conn);
> .
> }
>
> int iscsit_close_connection(
> 	struct iscsit_conn *conn)
> {
> .
> .
> .
> 	if (conn->conn_transport->iscsit_wait_conn)
> 		conn->conn_transport->iscsit_wait_conn(conn);
> .
> .
> .
> 	if (conn->conn_transport->iscsit_free_conn)
> 		conn->conn_transport->iscsit_free_conn(conn);
> .
> .
> }
>
> @Leon,
>
> I don't have the kernel logs in hand now. We can reproduce the issue
> and share.

You should look at the DEVICE_REMOVAL cm handler, it should call
iscsit_cause_connection_reinstatement. I do think that it needs to pass it
sleep=true (for device removal) so that it will wait for conn_wait_comp...

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4227 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: isert patch leaving resources behind
  2023-08-13  8:29 ` Leon Romanovsky
@ 2023-08-20  9:46   ` Leon Romanovsky
  2023-08-20 14:46     ` Sagi Grimberg
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2023-08-20  9:46 UTC (permalink / raw)
  To: Dennis Dalessandro, saravanan.vajravel
  Cc: OFED mailing list, Ehab Ababneh, Selvin Xavier, Sagi Grimberg

On Sun, Aug 13, 2023 at 11:29:31AM +0300, Leon Romanovsky wrote:
> On Thu, Aug 10, 2023 at 10:44:00AM -0400, Dennis Dalessandro wrote:
> > Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert connection") is
> > causing problems on OPA when we try to unload the driver after doing iSCI
> > testing. Reverting this commit causes the problem to go away. Any ideas?
> 
> 
> Saravanan, can you please post kernel logs as you wrote "When a bunch of iSER target
> is cleared, this issue can lead to use-after-free memory issue as isert conn is twice
> released" in the reverted commit?

So what is the resolution here?

Thanks

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: isert patch leaving resources behind
  2023-08-20  9:46   ` Leon Romanovsky
@ 2023-08-20 14:46     ` Sagi Grimberg
  2023-08-20 17:33       ` Leon Romanovsky
  0 siblings, 1 reply; 11+ messages in thread
From: Sagi Grimberg @ 2023-08-20 14:46 UTC (permalink / raw)
  To: Leon Romanovsky, Dennis Dalessandro, saravanan.vajravel
  Cc: OFED mailing list, Ehab Ababneh, Selvin Xavier


>>> Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert connection") is
>>> causing problems on OPA when we try to unload the driver after doing iSCI
>>> testing. Reverting this commit causes the problem to go away. Any ideas?
>>
>>
>> Saravanan, can you please post kernel logs as you wrote "When a bunch of iSER target
>> is cleared, this issue can lead to use-after-free memory issue as isert conn is twice
>> released" in the reverted commit?
> 
> So what is the resolution here?

We need saravanan to address the reported resource leak upon
DEVICE_REMOVAL. If this stalls, I suggest we revert the offending
commit and add it again without any leaks on surprise hca removals.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: isert patch leaving resources behind
  2023-08-20 14:46     ` Sagi Grimberg
@ 2023-08-20 17:33       ` Leon Romanovsky
  2023-08-21 10:47         ` Saravanan Vajravel
  0 siblings, 1 reply; 11+ messages in thread
From: Leon Romanovsky @ 2023-08-20 17:33 UTC (permalink / raw)
  To: Sagi Grimberg
  Cc: Dennis Dalessandro, saravanan.vajravel, OFED mailing list,
	Ehab Ababneh, Selvin Xavier

On Sun, Aug 20, 2023 at 05:46:12PM +0300, Sagi Grimberg wrote:
> 
> > > > Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert connection") is
> > > > causing problems on OPA when we try to unload the driver after doing iSCI
> > > > testing. Reverting this commit causes the problem to go away. Any ideas?
> > > 
> > > 
> > > Saravanan, can you please post kernel logs as you wrote "When a bunch of iSER target
> > > is cleared, this issue can lead to use-after-free memory issue as isert conn is twice
> > > released" in the reverted commit?
> > 
> > So what is the resolution here?
> 
> We need saravanan to address the reported resource leak upon
> DEVICE_REMOVAL. If this stalls, I suggest we revert the offending
> commit and add it again without any leaks on surprise hca removals.

Thanks, I'll keep an eye on it, if we don't get patch by EOW, I will revert it

^ permalink raw reply	[flat|nested] 11+ messages in thread

* RE: isert patch leaving resources behind
  2023-08-20 17:33       ` Leon Romanovsky
@ 2023-08-21 10:47         ` Saravanan Vajravel
  2023-08-21 11:30           ` Dennis Dalessandro
  0 siblings, 1 reply; 11+ messages in thread
From: Saravanan Vajravel @ 2023-08-21 10:47 UTC (permalink / raw)
  To: Leon Romanovsky, Sagi Grimberg
  Cc: Dennis Dalessandro, OFED mailing list, Ehab Ababneh, Selvin Xavier

[-- Attachment #1: Type: text/plain, Size: 1561 bytes --]

Ok. I will work on the resource leak issue on surprise removal and push
the patch. Until then, we can revert Commit: 699826f4e30a. I will find
some time this week to repro the resource leak issue.

Thanks & Regards,
Saravanan Vajravel
+91-80-46116256


-----Original Message-----
From: Leon Romanovsky <leon@kernel.org>
Sent: Sunday, August 20, 2023 11:04 PM
To: Sagi Grimberg <sagi@grimberg.me>
Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>;
saravanan.vajravel@broadcom.com; OFED mailing list
<linux-rdma@vger.kernel.org>; Ehab Ababneh
<ehab.ababneh@cornelisnetworks.com>; Selvin Xavier
<selvin.xavier@broadcom.com>
Subject: Re: isert patch leaving resources behind

On Sun, Aug 20, 2023 at 05:46:12PM +0300, Sagi Grimberg wrote:
>
> > > > Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert
> > > > connection") is causing problems on OPA when we try to unload
> > > > the driver after doing iSCI testing. Reverting this commit causes
the problem to go away. Any ideas?
> > >
> > >
> > > Saravanan, can you please post kernel logs as you wrote "When a
> > > bunch of iSER target is cleared, this issue can lead to
> > > use-after-free memory issue as isert conn is twice released" in the
reverted commit?
> >
> > So what is the resolution here?
>
> We need saravanan to address the reported resource leak upon
> DEVICE_REMOVAL. If this stalls, I suggest we revert the offending
> commit and add it again without any leaks on surprise hca removals.

Thanks, I'll keep an eye on it, if we don't get patch by EOW, I will
revert it

[-- Attachment #2: S/MIME Cryptographic Signature --]
[-- Type: application/pkcs7-signature, Size: 4227 bytes --]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: isert patch leaving resources behind
  2023-08-21 10:47         ` Saravanan Vajravel
@ 2023-08-21 11:30           ` Dennis Dalessandro
  0 siblings, 0 replies; 11+ messages in thread
From: Dennis Dalessandro @ 2023-08-21 11:30 UTC (permalink / raw)
  To: Saravanan Vajravel, Leon Romanovsky, Sagi Grimberg
  Cc: OFED mailing list, Ehab Ababneh, Selvin Xavier

On 8/21/23 6:47 AM, Saravanan Vajravel wrote:
> Ok. I will work on the resource leak issue on surprise removal and push
> the patch. Until then, we can revert Commit: 699826f4e30a. I will find
> some time this week to repro the resource leak issue.
> 
> Thanks & Regards,
> Saravanan Vajravel
> +91-80-46116256
> 
> 
> -----Original Message-----
> From: Leon Romanovsky <leon@kernel.org>
> Sent: Sunday, August 20, 2023 11:04 PM
> To: Sagi Grimberg <sagi@grimberg.me>
> Cc: Dennis Dalessandro <dennis.dalessandro@cornelisnetworks.com>;
> saravanan.vajravel@broadcom.com; OFED mailing list
> <linux-rdma@vger.kernel.org>; Ehab Ababneh
> <ehab.ababneh@cornelisnetworks.com>; Selvin Xavier
> <selvin.xavier@broadcom.com>
> Subject: Re: isert patch leaving resources behind
> 
> On Sun, Aug 20, 2023 at 05:46:12PM +0300, Sagi Grimberg wrote:
>>
>>>>> Commit: 699826f4e30a ("IB/isert: Fix incorrect release of isert
>>>>> connection") is causing problems on OPA when we try to unload
>>>>> the driver after doing iSCI testing. Reverting this commit causes
> the problem to go away. Any ideas?
>>>>
>>>>
>>>> Saravanan, can you please post kernel logs as you wrote "When a
>>>> bunch of iSER target is cleared, this issue can lead to
>>>> use-after-free memory issue as isert conn is twice released" in the
> reverted commit?
>>>
>>> So what is the resolution here?
>>
>> We need saravanan to address the reported resource leak upon
>> DEVICE_REMOVAL. If this stalls, I suggest we revert the offending
>> commit and add it again without any leaks on surprise hca removals.
> 
> Thanks, I'll keep an eye on it, if we don't get patch by EOW, I will
> revert it

Thanks for the help on this folks!

-Denny

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-08-21 11:30 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-08-10 14:44 isert patch leaving resources behind Dennis Dalessandro
2023-08-13  8:29 ` Leon Romanovsky
2023-08-20  9:46   ` Leon Romanovsky
2023-08-20 14:46     ` Sagi Grimberg
2023-08-20 17:33       ` Leon Romanovsky
2023-08-21 10:47         ` Saravanan Vajravel
2023-08-21 11:30           ` Dennis Dalessandro
2023-08-13 14:18 ` Sagi Grimberg
2023-08-14 11:20   ` Saravanan Vajravel
2023-08-14 12:36     ` Sagi Grimberg
2023-08-16 11:33       ` Saravanan Vajravel

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).