* null pointer in rxe_mr_copy()
@ 2022-04-11 3:34 Bob Pearson
  2022-04-11 5:14 ` Zhu Yanjun
  2022-04-12 4:11 ` Bob Pearson
  0 siblings, 2 replies; 10+ messages in thread
From: Bob Pearson @ 2022-04-11 3:34 UTC (permalink / raw)
  To: Zhu Yanjun, linux-rdma

Zhu,

Since checking for mr == NULL in rxe_mr_copy fixes the problem you were seeing in rping,
perhaps it would be a good idea to apply the following patch, which would tell us which of
the three calls to rxe_mr_copy is failing. My suspicion is the one in read_reply()
in rxe_resp.c.

This could be caused by a race between shutting down the qp and finishing up an RDMA read.
The responder resources state machine is completely unprotected from simultaneous access by
verbs code and bh code in rxe_resp.c. rxe_resp is a tasklet, so all the accesses from there are
serialized, but if anyone makes a verbs call that touches the responder resources it could
cause problems. The most likely (only?) place this could happen is qp shutdown.

Bob

diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c
index 60a31b718774..66184f5a4ddf 100644
--- a/drivers/infiniband/sw/rxe/rxe_mr.c
+++ b/drivers/infiniband/sw/rxe/rxe_mr.c
@@ -489,6 +489,7 @@ int copy_data(
 		if (bytes > 0) {
 			iova = sge->addr + offset;
 
+			WARN_ON(!mr);
 			err = rxe_mr_copy(mr, iova, addr, bytes, dir);
 			if (err)
 				goto err2;
diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c b/drivers/infiniband/sw/rxe/rxe_resp.c
index 1d95fab606da..6e3e86bdccd7 100644
--- a/drivers/infiniband/sw/rxe/rxe_resp.c
+++ b/drivers/infiniband/sw/rxe/rxe_resp.c
@@ -536,6 +536,7 @@ static enum resp_states write_data_in(struct rxe_qp *qp,
 	int err;
 	int data_len = payload_size(pkt);
 
+	WARN_ON(!qp->resp.mr);
 	err = rxe_mr_copy(qp->resp.mr, qp->resp.va + qp->resp.offset,
 			  payload_addr(pkt), data_len, RXE_TO_MR_OBJ);
 	if (err) {
@@ -772,6 +773,7 @@ static enum resp_states read_reply(struct rxe_qp *qp,
 	if (!skb)
 		return RESPST_ERR_RNR;
 
+	WARN_ON(!mr);
 	err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt),
 			  payload, RXE_FROM_MR_OBJ);
 	if (err)

^ permalink raw reply [flat|nested] 10+ messages in thread
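The idea behind the debug patch above can be sketched in userspace C. This is only an illustration under stated assumptions: struct toy_mr, toy_mr_copy(), toy_read_reply() and the TOY_WARN_ON() macro are hypothetical stand-ins, not the rxe driver's types or the kernel's WARN_ON(). It shows two things: a warning placed at each call site reports its own file and line, which identifies the caller that passed the NULL pointer, and a NULL check inside the callee keeps the copy from dereferencing a missing MR.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a memory region; not the rxe struct. */
struct toy_mr {
	char *buf;
	size_t len;
};

static int toy_warn_count;	/* number of warnings fired so far */

/* Report where the warning fired and count it; returns the condition. */
static int toy_warn(int cond, const char *file, int line, const char *expr)
{
	if (cond) {
		fprintf(stderr, "WARNING at %s:%d: %s\n", file, line, expr);
		toy_warn_count++;
	}
	return cond;
}

/* Userspace imitation of a call-site warning (assumption: this only
 * mimics the reporting, not the kernel's stack dump). */
#define TOY_WARN_ON(cond) toy_warn(!!(cond), __FILE__, __LINE__, #cond)

/* Copy len bytes from src into the region at offset. A NULL mr or an
 * out-of-range access is refused instead of being dereferenced. */
static int toy_mr_copy(struct toy_mr *mr, size_t offset,
		       const void *src, size_t len)
{
	if (!mr)
		return -EINVAL;
	if (offset > mr->len || len > mr->len - offset)
		return -EFAULT;
	memcpy(mr->buf + offset, src, len);
	return 0;
}

/* One of several hypothetical callers. The warning sits at the call
 * site, so the log line shows which caller passed the NULL pointer. */
static int toy_read_reply(struct toy_mr *mr, const void *src, size_t len)
{
	TOY_WARN_ON(!mr);
	return toy_mr_copy(mr, 0, src, len);
}
```

Calling toy_read_reply(NULL, ...) prints one warning naming the line inside toy_read_reply() and returns -EINVAL, which is the same split the thread describes: the check in the copy routine hides the crash, while the call-site warning says where the NULL came from.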
* Re: null pointer in rxe_mr_copy()
  2022-04-11 3:34 null pointer in rxe_mr_copy() Bob Pearson
@ 2022-04-11 5:14 ` Zhu Yanjun
  2022-04-11 5:34   ` Zhu Yanjun
  2022-04-12 4:11 ` Bob Pearson
  1 sibling, 1 reply; 10+ messages in thread
From: Zhu Yanjun @ 2022-04-11 5:14 UTC (permalink / raw)
  To: Bob Pearson; +Cc: linux-rdma

On Mon, Apr 11, 2022 at 11:34 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> Zhu,
>
> Since checking for mr == NULL in rxe_mr_copy fixes the problem you were seeing in rping.
> Perhaps it would be a good idea to apply the following patch which would tell us which of
> the three calls to rxe_mr_copy is failing. My suspicion is the one in read_reply()
> in rxe_resp.c

Hi, Bob

Yes. It is the function read_reply.

kernel: ------------[ cut here ]------------
kernel: WARNING: CPU: 74 PID: 38510 at drivers/infiniband/sw/rxe/rxe_resp.c:768 rxe_responder+0x1d67/0x1dd0 [rdma_rxe]
kernel: Modules linked in: rdma_rxe(OE) ip6_udp_tunnel udp_tunnel rds_rdma rds xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc vfat fat rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod target_core_mod intel_rapl_msr intel_rapl_common ib_iser libiscsi scsi_transport_iscsi rdma_cm ib_cm i10nm_edac iw_cm nfit libnvdimm x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm_intel kvm irdma iTCO_wdt iTCO_vendor_support i40e irqbypass crct10dif_pclmul crc32_pclmul ib_uverbs ghash_clmulni_intel rapl intel_cstate ib_core intel_uncore wmi_bmof pcspkr mei_me isst_if_mbox_pci isst_if_mmio acpi_ipmi isst_if_common ipmi_si i2c_i801 mei intel_pch_thermal i2c_smbus ipmi_devintf ipmi_msghandler acpi_power_meter ip_tables xfs libcrc32c sd_mod t10_pi crc64_rocksoft crc64 sg mgag200 i2c_algo_bit drm_shmem_helper drm_kms_helper syscopyarea sysfillrect ice
kernel: sysimgblt fb_sys_fops ahci drm libahci crc32c_intel libata megaraid_sas tg3 wmi dm_mirror dm_region_hash dm_log dm_mod fuse [last unloaded: ip6_udp_tunnel]
kernel: CPU: 74 PID: 38510 Comm: rping Kdump: loaded Tainted: G S W OE 5.18.0.RXE #14
kernel: Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.2.4 05/28/2021
kernel: RIP: 0010:rxe_responder+0x1d67/0x1dd0 [rdma_rxe]
kernel: Code: 24 30 48 89 44 24 30 49 8b 86 88 00 00 00 48 89 44 24 38 48 8b 73 20 48 8b 43 18 ff d0 0f 1f 00 e9 10 e3 ff ff e8 e9 52 98 ee <0f> 0b 45 8b 86 f0 00 00 00 48 8b 8c 24 e0 00 00 00 ba 01 03 00 00
kernel: RSP: 0018:ff5f5b78c7624e70 EFLAGS: 00010246
kernel: RAX: ff20346c70a1d700 RBX: ff20346c7127c040 RCX: ff20346c70a1d700
kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: ff20346c53194000
kernel: RBP: 0000000000000040 R08: 2ebbb556a556fe7f R09: 69de575d0320dc48
kernel: R10: ff5f5b78c7624de0 R11: 00000000ee4984a4 R12: ff20346c70a1d700
kernel: R13: 0000000000000000 R14: ff20346ef0539000 R15: ff20346c70a1c528
kernel: FS: 00007ff34d49b740(0000) GS:ff20347b3fa80000(0000) knlGS:0000000000000000
kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
kernel: CR2: 00007ff40be030c0 CR3: 00000003d0634005 CR4: 0000000000771ee0
kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
kernel: PKRU: 55555554
kernel: Call Trace:
kernel: <IRQ>
kernel: ? __local_bh_enable_ip+0x9f/0xe0
kernel: ? rxe_do_task+0x67/0xe0 [rdma_rxe]
kernel: ? __local_bh_enable_ip+0x77/0xe0
kernel: rxe_do_task+0x71/0xe0 [rdma_rxe]
kernel: tasklet_action_common.isra.15+0xb8/0xf0
kernel: __do_softirq+0xe4/0x48c
kernel: ? rxe_do_task+0x67/0xe0 [rdma_rxe]
kernel: do_softirq+0xb5/0x100
kernel: </IRQ>
kernel: <TASK>
kernel: __local_bh_enable_ip+0xd0/0xe0
kernel: rxe_do_task+0x67/0xe0 [rdma_rxe]
kernel: rxe_post_send+0x2ff/0x4c0 [rdma_rxe]
kernel: ? rdma_lookup_get_uobject+0x131/0x1e0 [ib_uverbs]
kernel: ib_uverbs_post_send+0x4d5/0x700 [ib_uverbs]
kernel: ib_uverbs_write+0x38f/0x5e0 [ib_uverbs]
kernel: ? find_held_lock+0x2d/0x90
kernel: vfs_write+0xb8/0x370
kernel: ksys_write+0xbb/0xd0
kernel: ? syscall_trace_enter.isra.15+0x169/0x220
kernel: do_syscall_64+0x37/0x80

Zhu Yanjun
* Re: null pointer in rxe_mr_copy()
  2022-04-11 5:14 ` Zhu Yanjun
@ 2022-04-11 5:34   ` Zhu Yanjun
  2022-04-11 16:25     ` Pearson, Robert B
  2022-05-24 13:18     ` yangx.jy
  0 siblings, 2 replies; 10+ messages in thread
From: Zhu Yanjun @ 2022-04-11 5:34 UTC (permalink / raw)
  To: Bob Pearson; +Cc: linux-rdma

On Mon, Apr 11, 2022 at 1:14 PM Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
>
> On Mon, Apr 11, 2022 at 11:34 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
> >
> > Zhu,
> >
> > Since checking for mr == NULL in rxe_mr_copy fixes the problem you were seeing in rping.
> > Perhaps it would be a good idea to apply the following patch which would tell us which of
> > the three calls to rxe_mr_copy is failing. My suspicion is the one in read_reply()
> > in rxe_resp.c
> Hi, Bob
>
> Yes. It is the function read_reply.

720 static enum resp_states read_reply(struct rxe_qp *qp,
721                                    struct rxe_pkt_info *req_pkt)
722 {
723         struct rxe_pkt_info ack_pkt;
724         struct sk_buff *skb;
725         int mtu = qp->mtu;
726         enum resp_states state;
727         int payload;
728         int opcode;
729         int err;
730         struct resp_res *res = qp->resp.res;
731         struct rxe_mr *mr;
732
733         if (!res) {
734                 res = rxe_prepare_read_res(qp, req_pkt);
735                 qp->resp.res = res;
736         }
737
738         if (res->state == rdatm_res_state_new) {
739                 mr = qp->resp.mr;   <---- It seems that mr comes from here.
740                 qp->resp.mr = NULL;
741
* RE: null pointer in rxe_mr_copy()
  2022-04-11 5:34   ` Zhu Yanjun
@ 2022-04-11 16:25     ` Pearson, Robert B
  2022-04-12 3:13       ` Bob Pearson
  2022-05-24 13:18     ` yangx.jy
  1 sibling, 1 reply; 10+ messages in thread
From: Pearson, Robert B @ 2022-04-11 16:25 UTC (permalink / raw)
  To: Zhu Yanjun, Bob Pearson; +Cc: linux-rdma

Zhu,

Would you be willing to try the v13 pool patch series? It also fixes the blktests bug.
(You have to apply Bart's scsi_debug revert patch to fix that issue.)
I think it may also fix this issue because it is way more careful about deferring qp cleanup
code until after all the packets have completed.

The bug you are seeing feels like a race with qp destroy.

Bob
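The general idea of deferring qp cleanup until all in-flight packets have completed can be sketched with a reference count. This is a minimal userspace illustration and not the v13 pool series or any rxe code: struct toy_qp, toy_qp_create(), toy_qp_get() and toy_qp_put() are hypothetical names, and the sketch assumes that the owner holds one reference and each in-flight packet takes one more.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical queue-pair object; not the rxe struct. */
struct toy_qp {
	atomic_int refcnt;	/* one ref for the owner, one per packet */
	int *cleaned;		/* set to 1 when cleanup finally runs */
};

/* Allocate an object holding the owner's initial reference. */
static struct toy_qp *toy_qp_create(int *cleaned)
{
	struct toy_qp *qp = malloc(sizeof(*qp));

	if (!qp)
		return NULL;
	atomic_init(&qp->refcnt, 1);
	qp->cleaned = cleaned;
	*cleaned = 0;
	return qp;
}

/* Take a reference for an in-flight packet. Returns false if the
 * count has already reached zero (the kernel's kref_get_unless_zero()
 * follows the same idea). */
static bool toy_qp_get(struct toy_qp *qp)
{
	int old = atomic_load(&qp->refcnt);

	while (old > 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&qp->refcnt, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; whoever drops the last one runs the cleanup. */
static void toy_qp_put(struct toy_qp *qp)
{
	if (atomic_fetch_sub(&qp->refcnt, 1) == 1) {
		*qp->cleaned = 1;
		free(qp);
	}
}
```

With this shape, a destroy call that drops the owner's reference while a packet still holds one does not free anything; cleanup runs only when the packet's completion path calls toy_qp_put(). Whether the actual patch series works this way is not stated in the thread.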
* Re: null pointer in rxe_mr_copy()
  2022-04-11 16:25     ` Pearson, Robert B
@ 2022-04-12 3:13       ` Bob Pearson
  2022-04-12 10:04         ` Yanjun Zhu
  0 siblings, 1 reply; 10+ messages in thread
From: Bob Pearson @ 2022-04-12 3:13 UTC (permalink / raw)
  To: Pearson, Robert B, Zhu Yanjun; +Cc: linux-rdma

On 4/11/22 11:25, Pearson, Robert B wrote:
> Zhu,
>
> Would you be willing to try the v13 pool patch series. It also fixes the blktests bug.
> (You have to apply Bart's scsi_debug revert patch to fix that issue.)
> I think it may also fix this issue because it is way more careful about deferring qp cleanup
> code until after all the packets have completed.
>
> The bug you are seeing feels like a race with qp destroy.
>
> Bob

When you run rping, are you going between two machines? It doesn't work in loopback
as far as I can tell.

Bob
* Re: null pointer in rxe_mr_copy()
  2022-04-12 3:13       ` Bob Pearson
@ 2022-04-12 10:04         ` Yanjun Zhu
  0 siblings, 0 replies; 10+ messages in thread
From: Yanjun Zhu @ 2022-04-12 10:04 UTC (permalink / raw)
  To: Bob Pearson, Pearson, Robert B, Zhu Yanjun; +Cc: linux-rdma

On 2022/4/12 11:13, Bob Pearson wrote:
> On 4/11/22 11:25, Pearson, Robert B wrote:
>> Zhu,
>>
>> Would you be willing to try the v13 pool patch series. It also fixes the blktests bug.
>> (You have to apply Bart's scsi_debug revert patch to fix that issue.)
>> I think it may also fix this issue because it is way more careful about deferring qp cleanup
>> code until after all the packets have completed.
>>
>> The bug you are seeing feels like a race with qp destroy.
>>
>> Bob
>>
>> -----Original Message-----
>> From: Zhu Yanjun <zyjzyj2000@gmail.com>
>> Sent: Monday, April 11, 2022 12:34 AM
>> To: Bob Pearson <rpearsonhpe@gmail.com>
>> Cc: linux-rdma@vger.kernel.org
>> Subject: Re: null pointer in rxe_mr_copy()
>>
>> On Mon, Apr 11, 2022 at 1:14 PM Zhu Yanjun <zyjzyj2000@gmail.com> wrote:
>>>
>>> On Mon, Apr 11, 2022 at 11:34 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>>>>
>>>> Zhu,
>>>>
>>>> Since checking for mr == NULL in rxe_mr_copy fixes the problem you were seeing in rping.
>>>> Perhaps it would be a good idea to apply the following patch which
>>>> would tell us which of the three calls to rxe_mr_copy is failing. My
>>>> suspicion is the one in read_reply()
>>> Hi, Bob
>>>
>>> Yes. It is the function read_reply.
>> >> 720 static enum resp_states read_reply(struct rxe_qp *qp, >> 721 struct rxe_pkt_info *req_pkt) >> 722 { >> 723 struct rxe_pkt_info ack_pkt; >> 724 struct sk_buff *skb; >> 725 int mtu = qp->mtu; >> 726 enum resp_states state; >> 727 int payload; >> 728 int opcode; >> 729 int err; >> 730 struct resp_res *res = qp->resp.res; >> 731 struct rxe_mr *mr; >> 732 >> 733 if (!res) { >> 734 res = rxe_prepare_read_res(qp, req_pkt); >> 735 qp->resp.res = res; >> 736 } >> 737 >> 738 if (res->state == rdatm_res_state_new) { >> 739 mr = qp->resp.mr; >> <----It seems that mr is from here. >> 740 qp->resp.mr = NULL; >> 741 >> >> >>> >>> kernel: ------------[ cut here ]------------ >>> kernel: WARNING: CPU: 74 PID: 38510 at >>> drivers/infiniband/sw/rxe/rxe_resp.c:768 rxe_responder+0x1d67/0x1dd0 >>> [rdma_rxe] >>> kernel: Modules linked in: rdma_rxe(OE) ip6_udp_tunnel udp_tunnel >>> rds_rdma rds xt_CHECKSUM xt_MASQUERADE xt_conntrack ipt_REJECT >>> nf_reject_ipv4 nft_compat nft_chain_nat nf_nat nf_conntrack >>> nf_defrag_ipv6 nf_defrag_ipv4 nf_tables nfnetlink tun bridge stp llc >>> vfat fat rpcrdma sunrpc rdma_ucm ib_srpt ib_isert iscsi_target_mod >>> target_core_mod intel_rapl_msr intel_rapl_common ib_iser libiscsi >>> scsi_transport_iscsi rdma_cm ib_cm i10nm_edac iw_cm nfit libnvdimm >>> x86_pkg_temp_thermal intel_powerclamp coretemp ipmi_ssif kvm_intel kvm >>> irdma iTCO_wdt iTCO_vendor_support i40e irqbypass crct10dif_pclmul >>> crc32_pclmul ib_uverbs ghash_clmulni_intel rapl intel_cstate ib_core >>> intel_uncore wmi_bmof pcspkr mei_me isst_if_mbox_pci isst_if_mmio >>> acpi_ipmi isst_if_common ipmi_si i2c_i801 mei intel_pch_thermal >>> i2c_smbus ipmi_devintf ipmi_msghandler acpi_power_meter ip_tables xfs >>> libcrc32c sd_mod t10_pi crc64_rocksoft crc64 sg mgag200 i2c_algo_bit >>> drm_shmem_helper drm_kms_helper syscopyarea sysfillrect ice >>> kernel: sysimgblt fb_sys_fops ahci drm libahci crc32c_intel libata >>> megaraid_sas tg3 wmi dm_mirror dm_region_hash dm_log dm_mod 
fuse [last >>> unloaded: ip6_udp_tunnel] >>> kernel: CPU: 74 PID: 38510 Comm: rping Kdump: loaded Tainted: G S >>> W OE 5.18.0.RXE #14 >>> kernel: Hardware name: Dell Inc. PowerEdge R750/06V45N, BIOS 1.2.4 >>> 05/28/2021 >>> kernel: RIP: 0010:rxe_responder+0x1d67/0x1dd0 [rdma_rxe] >>> kernel: Code: 24 30 48 89 44 24 30 49 8b 86 88 00 00 00 48 89 44 24 >>> 38 48 8b 73 20 48 8b 43 18 ff d0 0f 1f 00 e9 10 e3 ff ff e8 e9 52 98 >>> ee <0f> 0b 45 8b 86 f0 00 00 00 48 8b 8c 24 e0 00 00 00 ba 01 03 00 00 >>> kernel: RSP: 0018:ff5f5b78c7624e70 EFLAGS: 00010246 >>> kernel: RAX: ff20346c70a1d700 RBX: ff20346c7127c040 RCX: >>> ff20346c70a1d700 >>> kernel: RDX: 0000000000000000 RSI: 0000000000000000 RDI: >>> ff20346c53194000 >>> kernel: RBP: 0000000000000040 R08: 2ebbb556a556fe7f R09: >>> 69de575d0320dc48 >>> kernel: R10: ff5f5b78c7624de0 R11: 00000000ee4984a4 R12: >>> ff20346c70a1d700 >>> kernel: R13: 0000000000000000 R14: ff20346ef0539000 R15: >>> ff20346c70a1c528 >>> kernel: FS: 00007ff34d49b740(0000) GS:ff20347b3fa80000(0000) >>> knlGS:0000000000000000 >>> kernel: CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >>> kernel: CR2: 00007ff40be030c0 CR3: 00000003d0634005 CR4: >>> 0000000000771ee0 >>> kernel: DR0: 0000000000000000 DR1: 0000000000000000 DR2: >>> 0000000000000000 >>> kernel: DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: >>> 0000000000000400 >>> kernel: PKRU: 55555554 >>> kernel: Call Trace: >>> kernel: <IRQ> >>> kernel: ? __local_bh_enable_ip+0x9f/0xe0 >>> kernel: ? rxe_do_task+0x67/0xe0 [rdma_rxe] >>> kernel: ? __local_bh_enable_ip+0x77/0xe0 >>> kernel: rxe_do_task+0x71/0xe0 [rdma_rxe] >>> kernel: tasklet_action_common.isra.15+0xb8/0xf0 >>> kernel: __do_softirq+0xe4/0x48c >>> kernel: ? rxe_do_task+0x67/0xe0 [rdma_rxe] >>> kernel: do_softirq+0xb5/0x100 >>> kernel: </IRQ> >>> kernel: <TASK> >>> kernel: __local_bh_enable_ip+0xd0/0xe0 >>> kernel: rxe_do_task+0x67/0xe0 [rdma_rxe] >>> kernel: rxe_post_send+0x2ff/0x4c0 [rdma_rxe] >>> kernel: ? 
rdma_lookup_get_uobject+0x131/0x1e0 [ib_uverbs] >>> kernel: ib_uverbs_post_send+0x4d5/0x700 [ib_uverbs] >>> kernel: ib_uverbs_write+0x38f/0x5e0 [ib_uverbs] >>> kernel: ? find_held_lock+0x2d/0x90 >>> kernel: vfs_write+0xb8/0x370 >>> kernel: ksys_write+0xbb/0xd0 >>> kernel: ? syscall_trace_enter.isra.15+0x169/0x220 >>> kernel: do_syscall_64+0x37/0x80 >>> >>> Zhu Yanjun >>> >>> in rxe_resp.c >>>> This could be caused by a race between shutting down the qp and finishing up an RDMA read. >>>> The responder resources state machine is completely unprotected from >>>> simultaneous access by verbs code and bh code in rxe_resp.c. >>>> rxe_resp is a tasklet so all the accesses from there are serialized >>>> but if anyone makes a verbs call that touches the responder resources it could cause problems. The most likely (only?) place this could happen is qp shutdown. >>>> >>>> Bob >>>> >>>> >>>> >>>> diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c >>>> b/drivers/infiniband/sw/rxe/rxe_mr.c >>>> >>>> index 60a31b718774..66184f5a4ddf 100644 >>>> >>>> --- a/drivers/infiniband/sw/rxe/rxe_mr.c >>>> >>>> +++ b/drivers/infiniband/sw/rxe/rxe_mr.c >>>> >>>> @@ -489,6 +489,7 @@ int copy_data( >>>> >>>> if (bytes > 0) { >>>> >>>> iova = sge->addr + offset; >>>> >>>> >>>> >>>> + WARN_ON(!mr); >>>> >>>> err = rxe_mr_copy(mr, iova, addr, bytes, >>>> dir); >>>> >>>> if (err) >>>> >>>> goto err2; >>>> >>>> diff --git a/drivers/infiniband/sw/rxe/rxe_resp.c >>>> b/drivers/infiniband/sw/rxe/rxe_resp.c >>>> >>>> index 1d95fab606da..6e3e86bdccd7 100644 >>>> >>>> --- a/drivers/infiniband/sw/rxe/rxe_resp.c >>>> >>>> +++ b/drivers/infiniband/sw/rxe/rxe_resp.c >>>> >>>> @@ -536,6 +536,7 @@ static enum resp_states write_data_in(struct >>>> rxe_qp *qp, >>>> >>>> int err; >>>> >>>> int data_len = payload_size(pkt); >>>> >>>> >>>> >>>> + WARN_ON(!qp->resp.mr); >>>> >>>> err = rxe_mr_copy(qp->resp.mr, qp->resp.va + >>>> qp->resp.offset, >>>> >>>> payload_addr(pkt), data_len, >>>> RXE_TO_MR_OBJ); >>>> >>>> if 
(err) { >>>> >>>> @@ -772,6 +773,7 @@ static enum resp_states read_reply(struct rxe_qp >>>> *qp, >>>> >>>> if (!skb) >>>> >>>> return RESPST_ERR_RNR; >>>> >>>> >>>> >>>> + WARN_ON(!mr); >>>> >>>> err = rxe_mr_copy(mr, res->read.va, payload_addr(&ack_pkt), >>>> >>>> payload, RXE_FROM_MR_OBJ); >>>> >>>> if (err) >>>> > > When you run rping are you going between two machines? It doesn't work in loopback as far as I can tell. Sorry. It is late to reply. With 2 machines or loopback, long time rping (about 10 minutes or so), the crash will occur. Zhu Yanjun > > Bob ^ permalink raw reply [flat|nested] 10+ messages in thread
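The debugging approach quoted above, one WARN_ON(!mr) in front of each rxe_mr_copy() call so that the file and line in the warning identify which caller passed NULL, can be sketched in userspace C. This is only an illustration under stated assumptions: demo_mr, demo_warn, DEMO_WARN_ON, demo_mr_copy and demo_read_reply are hypothetical stand-ins invented for the sketch, not the rxe driver's types or functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical stand-in for a memory region; not the kernel's struct rxe_mr. */
struct demo_mr {
	const unsigned char *buf;
	size_t len;
};

/* Counts warnings so a caller can check whether any call site tripped. */
static int demo_warn_count;

/* Report the call site when cond is true; returns cond unchanged. */
static int demo_warn(int cond, const char *expr, const char *file, int line)
{
	if (cond) {
		demo_warn_count++;
		fprintf(stderr, "WARNING at %s:%d: %s\n", file, line, expr);
	}
	return cond;
}

/* Userspace imitation of WARN_ON(): the message names the call site. */
#define DEMO_WARN_ON(cond) demo_warn(!!(cond), #cond, __FILE__, __LINE__)

/* Copy len bytes starting at off; returns 0 on success, -1 on a bad argument. */
static int demo_mr_copy(const struct demo_mr *mr, size_t off,
			void *dst, size_t len)
{
	assert(dst != NULL);
	if (!mr || off > mr->len || len > mr->len - off)
		return -1;
	memcpy(dst, mr->buf + off, len);
	return 0;
}

/* One of several possible call sites; its warning is distinct from the others. */
static int demo_read_reply(const struct demo_mr *mr, void *dst, size_t len)
{
	DEMO_WARN_ON(!mr);
	return demo_mr_copy(mr, 0, dst, len);
}
```

Calling demo_read_reply() with a NULL region prints one warning naming that call site and then returns -1 from the guarded copy, which is the same division of labour as in the thread: the warning locates the caller, the NULL check avoids the crash.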
* Re: null pointer in rxe_mr_copy()
  2022-04-11  5:34 ` Zhu Yanjun
  2022-04-11 16:25   ` Pearson, Robert B
@ 2022-05-24 13:18   ` yangx.jy
  2022-05-24 18:07     ` Bob Pearson
  2022-05-24 23:56     ` Yanjun Zhu
  1 sibling, 2 replies; 10+ messages in thread
From: yangx.jy @ 2022-05-24 13:18 UTC
To: Zhu Yanjun, Bob Pearson; +Cc: linux-rdma

On 2022/4/11 13:34, Zhu Yanjun wrote:
> 738         if (res->state == rdatm_res_state_new) {
> 739                 mr = qp->resp.mr;
> <----It seems that mr is from here.
> 740                 qp->resp.mr = NULL;
> 741

Hi Bob and Yanjun

I wonder if the following patch has fixed the null pointer issue in
rxe_mr_copy().

commit 570a4bf7440e9fb2a4164244a6bf60a46362b627
Author: Bob Pearson <rpearsonhpe@gmail.com>
Date: Mon Apr 18 12:41:04 2022 -0500

    RDMA/rxe: Recheck the MR in when generating a READ reply

Best Regards,
Xiao Yang
* Re: null pointer in rxe_mr_copy()
  2022-05-24 13:18 ` yangx.jy
@ 2022-05-24 18:07 ` Bob Pearson
  2022-05-24 23:56 ` Yanjun Zhu
  1 sibling, 0 replies; 10+ messages in thread
From: Bob Pearson @ 2022-05-24 18:07 UTC
To: yangx.jy, Zhu Yanjun; +Cc: linux-rdma

On 5/24/22 08:18, yangx.jy@fujitsu.com wrote:
> On 2022/4/11 13:34, Zhu Yanjun wrote:
>> 738         if (res->state == rdatm_res_state_new) {
>> 739                 mr = qp->resp.mr;
>> <----It seems that mr is from here.
>> 740                 qp->resp.mr = NULL;
>> 741
>
> Hi Bob and Yanjun
>
> I wonder if the following patch has fixed the null pointer issue in
> rxe_mr_copy().
>
> commit 570a4bf7440e9fb2a4164244a6bf60a46362b627
> Author: Bob Pearson <rpearsonhpe@gmail.com>
> Date: Mon Apr 18 12:41:04 2022 -0500
>
>     RDMA/rxe: Recheck the MR in when generating a READ reply
>
> Best Regards,
> Xiao Yang

Correct.

Bob
* Re: null pointer in rxe_mr_copy()
  2022-05-24 13:18 ` yangx.jy
  2022-05-24 18:07 ` Bob Pearson
@ 2022-05-24 23:56 ` Yanjun Zhu
  1 sibling, 0 replies; 10+ messages in thread
From: Yanjun Zhu @ 2022-05-24 23:56 UTC
To: yangx.jy, Zhu Yanjun, Bob Pearson; +Cc: linux-rdma

On 2022/5/24 21:18, yangx.jy@fujitsu.com wrote:
> On 2022/4/11 13:34, Zhu Yanjun wrote:
>> 738         if (res->state == rdatm_res_state_new) {
>> 739                 mr = qp->resp.mr;
>> <----It seems that mr is from here.
>> 740                 qp->resp.mr = NULL;
>> 741
>
> Hi Bob and Yanjun
>
> I wonder if the following patch has fixed the null pointer issue in
> rxe_mr_copy().

Yes.

Zhu Yanjun

>
> commit 570a4bf7440e9fb2a4164244a6bf60a46362b627
> Author: Bob Pearson <rpearsonhpe@gmail.com>
> Date: Mon Apr 18 12:41:04 2022 -0500
>
>     RDMA/rxe: Recheck the MR in when generating a READ reply
>
> Best Regards,
> Xiao Yang
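The thread names the fixing commit only by its title and does not show its body. As a rough illustration of the general idea that title suggests, the userspace sketch below keeps a key rather than a cached pointer in the per-read state and looks the region up again before every reply packet, failing cleanly if it has gone away. Everything here (sketch_mr, sketch_table, sketch_recheck_mr, sketch_read_res, sketch_read_reply) is hypothetical and is not taken from the rxe driver or from the commit.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_MRS 4

/* Hypothetical memory region record; not the kernel's struct rxe_mr. */
struct sketch_mr {
	uint32_t rkey;
	int valid;
	const unsigned char *buf;
	size_t len;
};

/* Hypothetical registry standing in for the device's MR lookup. */
static struct sketch_mr sketch_table[SKETCH_MAX_MRS];

/* Look the region up again by key; NULL if it was invalidated or removed. */
static struct sketch_mr *sketch_recheck_mr(uint32_t rkey)
{
	for (size_t i = 0; i < SKETCH_MAX_MRS; i++)
		if (sketch_table[i].valid && sketch_table[i].rkey == rkey)
			return &sketch_table[i];
	return NULL;
}

/* State kept across the packets of one multi-packet read reply. */
struct sketch_read_res {
	uint32_t rkey;   /* a key, not a cached pointer */
	size_t offset;   /* bytes already sent */
};

/* Emit the next chunk of at most mtu bytes; returns bytes copied or -1. */
static long sketch_read_reply(struct sketch_read_res *res, void *dst, size_t mtu)
{
	struct sketch_mr *mr;
	size_t n;

	assert(res != NULL && dst != NULL);

	/* Recheck on every packet instead of trusting an earlier lookup. */
	mr = sketch_recheck_mr(res->rkey);
	if (!mr || res->offset > mr->len)
		return -1;

	n = mr->len - res->offset;
	if (n > mtu)
		n = mtu;
	memcpy(dst, mr->buf + res->offset, n);
	res->offset += n;
	return (long)n;
}
```

The design point is that a later packet of the same read never dereferences a pointer obtained while handling an earlier packet, so a region that disappears in between produces an error return rather than a NULL dereference.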
* Re: null pointer in rxe_mr_copy()
  2022-04-11  3:34 null pointer in rxe_mr_copy() Bob Pearson
  2022-04-11  5:14 ` Zhu Yanjun
@ 2022-04-12  4:11 ` Bob Pearson
  1 sibling, 0 replies; 10+ messages in thread
From: Bob Pearson @ 2022-04-12 4:11 UTC
To: Zhu Yanjun, linux-rdma

On 4/10/22 22:34, Bob Pearson wrote:
> Zhu,
>
> Since checking for mr == NULL in rxe_mr_copy fixes the problem you were seeing in rping.
> Perhaps it would be a good idea to apply the following patch which would tell us which of
> the three calls to rxe_mr_copy is failing. My suspicion is the one in read_reply() in rxe_resp.c
> This could be caused by a race between shutting down the qp and finishing up an RDMA read.
> The responder resources state machine is completely unprotected from simultaneous access by
> verbs code and bh code in rxe_resp.c. rxe_resp is a tasklet so all the accesses from there are
> serialized but if anyone makes a verbs call that touches the responder resources it could
> cause problems. The most likely (only?) place this could happen is qp shutdown.

I have reproduced a failure in rping on the v13 patch series, so never mind. It's something else.
On my system it runs for a couple of minutes between a pair of VMs with

	rping {-s|-c} -C 10000 -S 4096 -a 192.168.0.xx -d -V -p 1234

(-s on the server, -c on the client), and then the client hangs. There is nothing in dmesg, though.
The hang happens right after an RDMA read that reports success on the server. Possibly it happens
at 10000 packets, which feels like about the right time, but the job does not finish.
Thread overview: 10+ messages
2022-04-11  3:34 null pointer in rxe_mr_copy() Bob Pearson
2022-04-11  5:14 ` Zhu Yanjun
2022-04-11  5:34   ` Zhu Yanjun
2022-04-11 16:25     ` Pearson, Robert B
2022-04-12  3:13       ` Bob Pearson
2022-04-12 10:04         ` Yanjun Zhu
2022-05-24 13:18     ` yangx.jy
2022-05-24 18:07       ` Bob Pearson
2022-05-24 23:56       ` Yanjun Zhu
2022-04-12  4:11 ` Bob Pearson