* BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
@ 2021-05-30 7:33 Michal Kalderon
2021-06-08 16:50 ` Christoph Hellwig
2021-06-08 17:43 ` Sagi Grimberg
0 siblings, 2 replies; 8+ messages in thread
From: Michal Kalderon @ 2021-05-30 7:33 UTC (permalink / raw)
To: Christoph Hellwig, sagi; +Cc: linux-nvme, Shai Malin, Ariel Elior
Hi Christoph, Sagi,
We're testing some device error recovery scenarios and hit the following BUG; stack trace below.
In the error scenario, nvmet_rdma_queue_response() receives an error from the device when trying to post a WR.
This leads to nvmet_rdma_release_rsp() being called from softirq context, eventually
reaching blk_mq_delay_run_hw_queue(), which tries to schedule while in softirq (full stack below).
Could you please advise what the correct solution should be in this case?
thanks,
Michal
[ 8790.082863] nvmet_rdma: post_recv cmd failed
[ 8790.083484] nvmet_rdma: sending cmd response failed
[ 8790.084131] ------------[ cut here ]------------
[ 8790.084140] WARNING: CPU: 7 PID: 46 at block/blk-mq.c:1422 __blk_mq_run_hw_queue+0xb7/0x100
[ 8790.084619] Modules linked in: null_blk nvmet_rdma nvmet nvme_rdma nvme_fabrics nvme_core netconsole qedr(OE) qede(OE) qed(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM nft_chain_nat xt_MASQUERADE nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nft_counter nft_compat tun bridge stp llc nf_tables nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_umad rpcrdma rdma_ucm ib_iser rdma_cm iw_cm intel_rapl_msr intel_rapl_common ib_cm sb_edac libiscsi scsi_transport_iscsi kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sunrpc rapl ib_uverbs ib_core cirrus drm_kms_helper drm virtio_balloon i2c_piix4 pcspkr crc32c_intel virtio_net serio_raw net_failover failover floppy crc8 ata_generic pata_acpi qemu_fw_cfg [last unloaded: qedr]
[ 8790.084748] CPU: 7 PID: 46 Comm: ksoftirqd/7 Tainted: G OE 5.8.10 #1
[ 8790.084749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
[ 8790.084752] RIP: 0010:__blk_mq_run_hw_queue+0xb7/0x100
[ 8790.084753] Code: 00 48 89 ef e8 ea 34 c8 ff 48 89 df 41 89 c4 e8 1f 7f 00 00 f6 83 a8 00 00 00 20 74 b1 41 f7 c4 fe ff ff ff 74 b7 0f 0b eb b3 <0f> 0b eb 86 48 83 bf 98 00 00 00 00 48 c7 c0 df 81 3f 82 48 c7 c2
[ 8790.084754] RSP: 0018:ffffc9000020ba60 EFLAGS: 00010206
[ 8790.084755] RAX: 0000000000000100 RBX: ffff88809fe8c400 RCX: 00000000ffffffff
[ 8790.084756] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88809fe8c400
[ 8790.084756] RBP: ffff888137b81a50 R08: ffffffffffffffff R09: 0000000000000020
[ 8790.084757] R10: 0000000000000001 R11: ffff8881365d4968 R12: 0000000000000000
[ 8790.084758] R13: ffff888137b81a40 R14: ffff88811e2b9e80 R15: ffff8880b3d964f0
[ 8790.084759] FS: 0000000000000000(0000) GS:ffff88813bbc0000(0000) knlGS:0000000000000000
[ 8790.084759] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 8790.084760] CR2: 000055ca53900da8 CR3: 000000012b83e006 CR4: 0000000000360ee0
[ 8790.084763] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 8790.084763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 8790.084764] Call Trace:
[ 8790.084767] __blk_mq_delay_run_hw_queue+0x140/0x160
[ 8790.084768] blk_mq_get_tag+0x1d1/0x270
[ 8790.084771] ? finish_wait+0x80/0x80
[ 8790.084773] __blk_mq_alloc_request+0xb1/0x100
[ 8790.084774] blk_mq_make_request+0x144/0x5d0
[ 8790.084778] generic_make_request+0x2db/0x340
[ 8790.084779] ? bvec_alloc+0x82/0xe0
[ 8790.084781] submit_bio+0x43/0x160
[ 8790.084781] ? bio_add_page+0x39/0x90
[ 8790.084794] nvmet_bdev_execute_rw+0x28c/0x360 [nvmet]
[ 8790.084800] nvmet_rdma_execute_command+0x72/0x110 [nvmet_rdma]
[ 8790.084802] nvmet_rdma_release_rsp+0xc1/0x1e0 [nvmet_rdma]
[ 8790.084804] nvmet_rdma_queue_response.cold.63+0x14/0x19 [nvmet_rdma]
[ 8790.084806] nvmet_req_complete+0x11/0x40 [nvmet]
[ 8790.084809] nvmet_bio_done+0x27/0x100 [nvmet]
[ 8790.084811] blk_update_request+0x23e/0x3b0
[ 8790.084812] blk_mq_end_request+0x1a/0x120
[ 8790.084814] blk_done_softirq+0xa1/0xd0
[ 8790.084818] __do_softirq+0xe4/0x2f8
[ 8790.084821] ? sort_range+0x20/0x20
[ 8790.084824] run_ksoftirqd+0x26/0x40
[ 8790.084825] smpboot_thread_fn+0xc5/0x160
[ 8790.084827] kthread+0x116/0x130
[ 8790.084828] ? kthread_park+0x80/0x80
[ 8790.084832] ret_from_fork+0x22/0x30
[ 8790.084833] ---[ end trace 16ec813ee3f82b56 ]---
[ 8790.085314] BUG: scheduling while atomic: ksoftirqd/7/46/0x00000100
_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
* Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
2021-05-30 7:33 BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request Michal Kalderon
@ 2021-06-08 16:50 ` Christoph Hellwig
2021-06-08 17:43 ` Sagi Grimberg
1 sibling, 0 replies; 8+ messages in thread
From: Christoph Hellwig @ 2021-06-08 16:50 UTC (permalink / raw)
To: Michal Kalderon
Cc: Christoph Hellwig, sagi, linux-nvme, Shai Malin, Ariel Elior
What kernel version is this?
On Sun, May 30, 2021 at 07:33:18AM +0000, Michal Kalderon wrote:
>
> this leads to nvmet_rdma_release_rsp being called from softirq eventually
> reaching the blk_mq_delay_run_hw_queue which tries to schedule in softirq. (full stack below)
>
> could you please advise what the correct solution should be in this case?
>
> thanks,
> Michal
>
> [ 8790.082863] nvmet_rdma: post_recv cmd failed
> [ 8790.083484] nvmet_rdma: sending cmd response failed
> [ 8790.084131] ------------[ cut here ]------------
> [ 8790.084140] WARNING: CPU: 7 PID: 46 at block/blk-mq.c:1422 __blk_mq_run_hw_queue+0xb7/0x100
> [ 8790.084619] Modules linked in: null_blk nvmet_rdma nvmet nvme_rdma nvme_fabrics nvme_core netconsole qedr(OE) qede(OE) qed(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM nft_chain_nat xt_MASQUERADE nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nft_counter nft_compat tun bridge stp llc nf_tables nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_umad rpcrdma rdma_ucm ib_iser rdma_cm iw_cm intel_rapl_msr intel_rapl_common ib_cm sb_edac libiscsi scsi_transport_iscsi kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sunrpc rapl ib_uverbs ib_core cirrus drm_kms_helper drm virtio_balloon i2c_piix4 pcspkr crc32c_intel virtio_net serio_raw net_failover failover floppy crc8 ata_generic pata_acpi qemu_fw_cfg [last unloaded: qedr]
> [ 8790.084748] CPU: 7 PID: 46 Comm: ksoftirqd/7 Tainted: G OE 5.8.10 #1
> [ 8790.084749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> [ 8790.084752] RIP: 0010:__blk_mq_run_hw_queue+0xb7/0x100
> [ 8790.084753] Code: 00 48 89 ef e8 ea 34 c8 ff 48 89 df 41 89 c4 e8 1f 7f 00 00 f6 83 a8 00 00 00 20 74 b1 41 f7 c4 fe ff ff ff 74 b7 0f 0b eb b3 <0f> 0b eb 86 48 83 bf 98 00 00 00 00 48 c7 c0 df 81 3f 82 48 c7 c2
> [ 8790.084754] RSP: 0018:ffffc9000020ba60 EFLAGS: 00010206
> [ 8790.084755] RAX: 0000000000000100 RBX: ffff88809fe8c400 RCX: 00000000ffffffff
> [ 8790.084756] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88809fe8c400
> [ 8790.084756] RBP: ffff888137b81a50 R08: ffffffffffffffff R09: 0000000000000020
> [ 8790.084757] R10: 0000000000000001 R11: ffff8881365d4968 R12: 0000000000000000
> [ 8790.084758] R13: ffff888137b81a40 R14: ffff88811e2b9e80 R15: ffff8880b3d964f0
> [ 8790.084759] FS: 0000000000000000(0000) GS:ffff88813bbc0000(0000) knlGS:0000000000000000
> [ 8790.084759] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8790.084760] CR2: 000055ca53900da8 CR3: 000000012b83e006 CR4: 0000000000360ee0
> [ 8790.084763] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 8790.084763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 8790.084764] Call Trace:
> [ 8790.084767] __blk_mq_delay_run_hw_queue+0x140/0x160
> [ 8790.084768] blk_mq_get_tag+0x1d1/0x270
> [ 8790.084771] ? finish_wait+0x80/0x80
> [ 8790.084773] __blk_mq_alloc_request+0xb1/0x100
> [ 8790.084774] blk_mq_make_request+0x144/0x5d0
> [ 8790.084778] generic_make_request+0x2db/0x340
> [ 8790.084779] ? bvec_alloc+0x82/0xe0
> [ 8790.084781] submit_bio+0x43/0x160
> [ 8790.084781] ? bio_add_page+0x39/0x90
> [ 8790.084794] nvmet_bdev_execute_rw+0x28c/0x360 [nvmet]
> [ 8790.084800] nvmet_rdma_execute_command+0x72/0x110 [nvmet_rdma]
> [ 8790.084802] nvmet_rdma_release_rsp+0xc1/0x1e0 [nvmet_rdma]
> [ 8790.084804] nvmet_rdma_queue_response.cold.63+0x14/0x19 [nvmet_rdma]
> [ 8790.084806] nvmet_req_complete+0x11/0x40 [nvmet]
> [ 8790.084809] nvmet_bio_done+0x27/0x100 [nvmet]
> [ 8790.084811] blk_update_request+0x23e/0x3b0
> [ 8790.084812] blk_mq_end_request+0x1a/0x120
> [ 8790.084814] blk_done_softirq+0xa1/0xd0
> [ 8790.084818] __do_softirq+0xe4/0x2f8
> [ 8790.084821] ? sort_range+0x20/0x20
> [ 8790.084824] run_ksoftirqd+0x26/0x40
> [ 8790.084825] smpboot_thread_fn+0xc5/0x160
> [ 8790.084827] kthread+0x116/0x130
> [ 8790.084828] ? kthread_park+0x80/0x80
> [ 8790.084832] ret_from_fork+0x22/0x30
> [ 8790.084833] ---[ end trace 16ec813ee3f82b56 ]---
> [ 8790.085314] BUG: scheduling while atomic: ksoftirqd/7/46/0x00000100
---end quoted text---
* Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
2021-05-30 7:33 BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request Michal Kalderon
2021-06-08 16:50 ` Christoph Hellwig
@ 2021-06-08 17:43 ` Sagi Grimberg
2021-06-08 18:41 ` Keith Busch
1 sibling, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2021-06-08 17:43 UTC (permalink / raw)
To: Michal Kalderon, Christoph Hellwig; +Cc: linux-nvme, Shai Malin, Ariel Elior
> Hi Christoph, Sagi,
>
> We're testing some device error recovery scenarios and hit the following BUG, stack trace below.
> In the error scenario, nvmet_rdma_queue_response receives an error from the device when trying to post a wr,
>
> this leads to nvmet_rdma_release_rsp being called from softirq eventually
> reaching the blk_mq_delay_run_hw_queue which tries to schedule in softirq. (full stack below)
>
> could you please advise what the correct solution should be in this case?
Hey Michal,
I agree this can happen and requires correction. Does the below resolve
the issue?
--
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 7d607f435e36..6d2eea322779 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -16,6 +16,7 @@
#include <linux/wait.h>
#include <linux/inet.h>
#include <asm/unaligned.h>
+#include <linux/async.h>
#include <rdma/ib_verbs.h>
#include <rdma/rdma_cm.h>
@@ -712,6 +713,12 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
}
}
+static void nvmet_rdma_async_release_rsp(void *data, async_cookie_t cookie)
+{
+ struct nvmet_rdma_rsp *rsp = data;
+ nvmet_rdma_release_rsp(rsp);
+}
+
static void nvmet_rdma_queue_response(struct nvmet_req *req)
{
struct nvmet_rdma_rsp *rsp =
@@ -745,7 +752,12 @@ static void nvmet_rdma_queue_response(struct nvmet_req *req)
if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
pr_err("sending cmd response failed\n");
- nvmet_rdma_release_rsp(rsp);
+ /*
+ * We might be in atomic context, hence release
+ * the rsp in async context in case we need to
+ * process the wr_wait_list.
+ */
+ async_schedule(nvmet_rdma_async_release_rsp, rsp);
}
}
--
>
> thanks,
> Michal
>
> [ 8790.082863] nvmet_rdma: post_recv cmd failed
> [ 8790.083484] nvmet_rdma: sending cmd response failed
> [ 8790.084131] ------------[ cut here ]------------
> [ 8790.084140] WARNING: CPU: 7 PID: 46 at block/blk-mq.c:1422 __blk_mq_run_hw_queue+0xb7/0x100
> [ 8790.084619] Modules linked in: null_blk nvmet_rdma nvmet nvme_rdma nvme_fabrics nvme_core netconsole qedr(OE) qede(OE) qed(OE) rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace fscache xt_CHECKSUM nft_chain_nat xt_MASQUERADE nf_nat xt_conntrack nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c nft_counter nft_compat tun bridge stp llc nf_tables nfnetlink ib_isert iscsi_target_mod ib_srpt target_core_mod ib_srp scsi_transport_srp ib_ipoib ib_umad rpcrdma rdma_ucm ib_iser rdma_cm iw_cm intel_rapl_msr intel_rapl_common ib_cm sb_edac libiscsi scsi_transport_iscsi kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel sunrpc rapl ib_uverbs ib_core cirrus drm_kms_helper drm virtio_balloon i2c_piix4 pcspkr crc32c_intel virtio_net serio_raw net_failover failover floppy crc8 ata_generic pata_acpi qemu_fw_cfg [last unloaded: qedr]
> [ 8790.084748] CPU: 7 PID: 46 Comm: ksoftirqd/7 Tainted: G OE 5.8.10 #1
> [ 8790.084749] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS ?-20180724_192412-buildhw-07.phx2.fedoraproject.org-1.fc29 04/01/2014
> [ 8790.084752] RIP: 0010:__blk_mq_run_hw_queue+0xb7/0x100
> [ 8790.084753] Code: 00 48 89 ef e8 ea 34 c8 ff 48 89 df 41 89 c4 e8 1f 7f 00 00 f6 83 a8 00 00 00 20 74 b1 41 f7 c4 fe ff ff ff 74 b7 0f 0b eb b3 <0f> 0b eb 86 48 83 bf 98 00 00 00 00 48 c7 c0 df 81 3f 82 48 c7 c2
> [ 8790.084754] RSP: 0018:ffffc9000020ba60 EFLAGS: 00010206
> [ 8790.084755] RAX: 0000000000000100 RBX: ffff88809fe8c400 RCX: 00000000ffffffff
> [ 8790.084756] RDX: 0000000000000000 RSI: 0000000000000000 RDI: ffff88809fe8c400
> [ 8790.084756] RBP: ffff888137b81a50 R08: ffffffffffffffff R09: 0000000000000020
> [ 8790.084757] R10: 0000000000000001 R11: ffff8881365d4968 R12: 0000000000000000
> [ 8790.084758] R13: ffff888137b81a40 R14: ffff88811e2b9e80 R15: ffff8880b3d964f0
> [ 8790.084759] FS: 0000000000000000(0000) GS:ffff88813bbc0000(0000) knlGS:0000000000000000
> [ 8790.084759] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 8790.084760] CR2: 000055ca53900da8 CR3: 000000012b83e006 CR4: 0000000000360ee0
> [ 8790.084763] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 8790.084763] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 8790.084764] Call Trace:
> [ 8790.084767] __blk_mq_delay_run_hw_queue+0x140/0x160
> [ 8790.084768] blk_mq_get_tag+0x1d1/0x270
> [ 8790.084771] ? finish_wait+0x80/0x80
> [ 8790.084773] __blk_mq_alloc_request+0xb1/0x100
> [ 8790.084774] blk_mq_make_request+0x144/0x5d0
> [ 8790.084778] generic_make_request+0x2db/0x340
> [ 8790.084779] ? bvec_alloc+0x82/0xe0
> [ 8790.084781] submit_bio+0x43/0x160
> [ 8790.084781] ? bio_add_page+0x39/0x90
> [ 8790.084794] nvmet_bdev_execute_rw+0x28c/0x360 [nvmet]
> [ 8790.084800] nvmet_rdma_execute_command+0x72/0x110 [nvmet_rdma]
> [ 8790.084802] nvmet_rdma_release_rsp+0xc1/0x1e0 [nvmet_rdma]
> [ 8790.084804] nvmet_rdma_queue_response.cold.63+0x14/0x19 [nvmet_rdma]
> [ 8790.084806] nvmet_req_complete+0x11/0x40 [nvmet]
> [ 8790.084809] nvmet_bio_done+0x27/0x100 [nvmet]
> [ 8790.084811] blk_update_request+0x23e/0x3b0
> [ 8790.084812] blk_mq_end_request+0x1a/0x120
> [ 8790.084814] blk_done_softirq+0xa1/0xd0
> [ 8790.084818] __do_softirq+0xe4/0x2f8
> [ 8790.084821] ? sort_range+0x20/0x20
> [ 8790.084824] run_ksoftirqd+0x26/0x40
> [ 8790.084825] smpboot_thread_fn+0xc5/0x160
> [ 8790.084827] kthread+0x116/0x130
> [ 8790.084828] ? kthread_park+0x80/0x80
> [ 8790.084832] ret_from_fork+0x22/0x30
> [ 8790.084833] ---[ end trace 16ec813ee3f82b56 ]---
> [ 8790.085314] BUG: scheduling while atomic: ksoftirqd/7/46/0x00000100
>
* Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
2021-06-08 17:43 ` Sagi Grimberg
@ 2021-06-08 18:41 ` Keith Busch
2021-06-09 0:03 ` Sagi Grimberg
0 siblings, 1 reply; 8+ messages in thread
From: Keith Busch @ 2021-06-08 18:41 UTC (permalink / raw)
To: Sagi Grimberg
Cc: Michal Kalderon, Christoph Hellwig, linux-nvme, Shai Malin, Ariel Elior
On Tue, Jun 08, 2021 at 10:43:45AM -0700, Sagi Grimberg wrote:
> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> index 7d607f435e36..6d2eea322779 100644
> --- a/drivers/nvme/target/rdma.c
> +++ b/drivers/nvme/target/rdma.c
> @@ -16,6 +16,7 @@
> #include <linux/wait.h>
> #include <linux/inet.h>
> #include <asm/unaligned.h>
> +#include <linux/async.h>
>
> #include <rdma/ib_verbs.h>
> #include <rdma/rdma_cm.h>
> @@ -712,6 +713,12 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
> }
> }
>
> +static void nvmet_rdma_async_release_rsp(void *data, async_cookie_t cookie)
> +{
> + struct nvmet_rdma_rsp *rsp = data;
> + nvmet_rdma_release_rsp(rsp);
> +}
> +
> static void nvmet_rdma_queue_response(struct nvmet_req *req)
> {
> struct nvmet_rdma_rsp *rsp =
> @@ -745,7 +752,12 @@ static void nvmet_rdma_queue_response(struct nvmet_req *req)
>
> if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
> pr_err("sending cmd response failed\n");
> - nvmet_rdma_release_rsp(rsp);
> + /*
> + * We might be in atomic context, hence release
> + * the rsp in async context in case we need to
> + * process the wr_wait_list.
> + */
> + async_schedule(nvmet_rdma_async_release_rsp, rsp);
> }
> }
Just FYI, async_schedule() has conditions where it may execute your
callback synchronously. Your suggestion is probably fine for testing,
but it sounds like you require something that can guarantee a non-atomic
context for nvmet_rdma_release_rsp().
* Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
2021-06-08 18:41 ` Keith Busch
@ 2021-06-09 0:03 ` Sagi Grimberg
2021-06-14 14:44 ` [EXT] " Michal Kalderon
0 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2021-06-09 0:03 UTC (permalink / raw)
To: Keith Busch
Cc: Michal Kalderon, Christoph Hellwig, linux-nvme, Shai Malin, Ariel Elior
>> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
>> index 7d607f435e36..6d2eea322779 100644
>> --- a/drivers/nvme/target/rdma.c
>> +++ b/drivers/nvme/target/rdma.c
>> @@ -16,6 +16,7 @@
>> #include <linux/wait.h>
>> #include <linux/inet.h>
>> #include <asm/unaligned.h>
>> +#include <linux/async.h>
>>
>> #include <rdma/ib_verbs.h>
>> #include <rdma/rdma_cm.h>
>> @@ -712,6 +713,12 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
>> }
>> }
>>
>> +static void nvmet_rdma_async_release_rsp(void *data, async_cookie_t cookie)
>> +{
>> + struct nvmet_rdma_rsp *rsp = data;
>> + nvmet_rdma_release_rsp(rsp);
>> +}
>> +
>> static void nvmet_rdma_queue_response(struct nvmet_req *req)
>> {
>> struct nvmet_rdma_rsp *rsp =
>> @@ -745,7 +752,12 @@ static void nvmet_rdma_queue_response(struct nvmet_req *req)
>>
>> if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
>> pr_err("sending cmd response failed\n");
>> - nvmet_rdma_release_rsp(rsp);
>> + /*
>> + * We might be in atomic context, hence release
>> + * the rsp in async context in case we need to
>> + * process the wr_wait_list.
>> + */
>> + async_schedule(nvmet_rdma_async_release_rsp, rsp);
>> }
>> }
>
> Just FYI, async_schedule() has conditions where it may execute your
> callback synchronously. Your suggestion is probably fine for testing,
> but it sounds like you require something that can guarantee a non-atomic
> context for nvmet_rdma_release_rsp().
OK, it seems that the issue is that we are submitting I/O in atomic
context. This should be more appropriate...
--
diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
index 7d607f435e36..16f2f5a84ae7 100644
--- a/drivers/nvme/target/rdma.c
+++ b/drivers/nvme/target/rdma.c
@@ -102,6 +102,7 @@ struct nvmet_rdma_queue {
struct work_struct release_work;
struct list_head rsp_wait_list;
+ struct work_struct wr_wait_work;
struct list_head rsp_wr_wait_list;
spinlock_t rsp_wr_wait_lock;
@@ -517,8 +518,10 @@ static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev,
return ret;
}
-static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue)
+static void nvmet_rdma_process_wr_wait_list(struct work_struct *w)
{
+ struct nvmet_rdma_queue *queue =
+ container_of(w, struct nvmet_rdma_queue, wr_wait_work);
spin_lock(&queue->rsp_wr_wait_lock);
while (!list_empty(&queue->rsp_wr_wait_list)) {
struct nvmet_rdma_rsp *rsp;
@@ -677,7 +680,7 @@ static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
nvmet_req_free_sgls(&rsp->req);
if (unlikely(!list_empty_careful(&queue->rsp_wr_wait_list)))
- nvmet_rdma_process_wr_wait_list(queue);
+ schedule_work(&queue->wr_wait_work);
nvmet_rdma_put_rsp(rsp);
}
@@ -1446,6 +1449,7 @@ nvmet_rdma_alloc_queue(struct nvmet_rdma_device *ndev,
* inside a CM callback would trigger a deadlock. (great API design..)
*/
INIT_WORK(&queue->release_work, nvmet_rdma_release_queue_work);
+ INIT_WORK(&queue->wr_wait_work, nvmet_rdma_process_wr_wait_list);
queue->dev = ndev;
queue->cm_id = cm_id;
queue->port = port->nport;
--
* RE: [EXT] Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
2021-06-09 0:03 ` Sagi Grimberg
@ 2021-06-14 14:44 ` Michal Kalderon
2021-06-14 16:44 ` Sagi Grimberg
0 siblings, 1 reply; 8+ messages in thread
From: Michal Kalderon @ 2021-06-14 14:44 UTC (permalink / raw)
To: Sagi Grimberg, Keith Busch
Cc: Christoph Hellwig, linux-nvme, Shai Malin, Ariel Elior
> From: Sagi Grimberg <sagi@grimberg.me>
> Sent: Wednesday, June 9, 2021 3:04 AM
>
> ----------------------------------------------------------------------
>
> >> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> >> index 7d607f435e36..6d2eea322779 100644
> >> --- a/drivers/nvme/target/rdma.c
> >> +++ b/drivers/nvme/target/rdma.c
> >> @@ -16,6 +16,7 @@
> >> #include <linux/wait.h>
> >> #include <linux/inet.h>
> >> #include <asm/unaligned.h>
> >> +#include <linux/async.h>
> >>
> >> #include <rdma/ib_verbs.h>
> >> #include <rdma/rdma_cm.h>
> >> @@ -712,6 +713,12 @@ static void nvmet_rdma_send_done(struct ib_cq *cq, struct ib_wc *wc)
> >> }
> >> }
> >>
> >> +static void nvmet_rdma_async_release_rsp(void *data, async_cookie_t cookie)
> >> +{
> >> + struct nvmet_rdma_rsp *rsp = data;
> >> + nvmet_rdma_release_rsp(rsp);
> >> +}
> >> +
> >> static void nvmet_rdma_queue_response(struct nvmet_req *req)
> >> {
> >> struct nvmet_rdma_rsp *rsp =
> >> @@ -745,7 +752,12 @@ static void nvmet_rdma_queue_response(struct nvmet_req *req)
> >>
> >> if (unlikely(ib_post_send(cm_id->qp, first_wr, NULL))) {
> >> pr_err("sending cmd response failed\n");
> >> - nvmet_rdma_release_rsp(rsp);
> >> + /*
> >> + * We might be in atomic context, hence release
> >> + * the rsp in async context in case we need to
> >> + * process the wr_wait_list.
> >> + */
> >> + async_schedule(nvmet_rdma_async_release_rsp, rsp);
> >> }
> >> }
> >
> > Just FYI, async_schedule() has conditions where it may execute your
> > callback synchronously. Your suggestion is probably fine for testing,
> > but it sounds like you require something that can guarantee a non-atomic
> > context for nvmet_rdma_release_rsp().
>
> OK, it seems that the issue is that we are submitting I/O in atomic
> context. This should be more appropriate...
Thanks Sagi, this seems to work. I'm still hitting some other issues where in some cases reconnect fails, but I'm collecting more info.
>
> --
> diff --git a/drivers/nvme/target/rdma.c b/drivers/nvme/target/rdma.c
> index 7d607f435e36..16f2f5a84ae7 100644
> --- a/drivers/nvme/target/rdma.c
> +++ b/drivers/nvme/target/rdma.c
> @@ -102,6 +102,7 @@ struct nvmet_rdma_queue {
>
> struct work_struct release_work;
> struct list_head rsp_wait_list;
> + struct work_struct wr_wait_work;
> struct list_head rsp_wr_wait_list;
> spinlock_t rsp_wr_wait_lock;
>
> @@ -517,8 +518,10 @@ static int nvmet_rdma_post_recv(struct nvmet_rdma_device *ndev,
> return ret;
> }
>
> -static void nvmet_rdma_process_wr_wait_list(struct nvmet_rdma_queue *queue)
> +static void nvmet_rdma_process_wr_wait_list(struct work_struct *w)
> {
> + struct nvmet_rdma_queue *queue =
> + container_of(w, struct nvmet_rdma_queue, wr_wait_work);
> spin_lock(&queue->rsp_wr_wait_lock);
> while (!list_empty(&queue->rsp_wr_wait_list)) {
> struct nvmet_rdma_rsp *rsp;
> @@ -677,7 +680,7 @@ static void nvmet_rdma_release_rsp(struct nvmet_rdma_rsp *rsp)
> nvmet_req_free_sgls(&rsp->req);
>
> if (unlikely(!list_empty_careful(&queue->rsp_wr_wait_list)))
> - nvmet_rdma_process_wr_wait_list(queue);
> + schedule_work(&queue->wr_wait_work);
>
> nvmet_rdma_put_rsp(rsp);
> }
> @@ -1446,6 +1449,7 @@ nvmet_rdma_alloc_queue(struct nvmet_rdma_device *ndev,
> * inside a CM callback would trigger a deadlock. (great API design..)
> */
> INIT_WORK(&queue->release_work, nvmet_rdma_release_queue_work);
> + INIT_WORK(&queue->wr_wait_work, nvmet_rdma_process_wr_wait_list);
> queue->dev = ndev;
> queue->cm_id = cm_id;
> queue->port = port->nport;
> --
Thanks,
Tested-by: Michal Kalderon <michal.kalderon@marvell.com>
* Re: [EXT] Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
2021-06-14 14:44 ` [EXT] " Michal Kalderon
@ 2021-06-14 16:44 ` Sagi Grimberg
2021-06-14 18:14 ` Michal Kalderon
0 siblings, 1 reply; 8+ messages in thread
From: Sagi Grimberg @ 2021-06-14 16:44 UTC (permalink / raw)
To: Michal Kalderon, Keith Busch
Cc: Christoph Hellwig, linux-nvme, Shai Malin, Ariel Elior
>> OK, it seems that the issue is that we are submitting I/O in atomic
>> context. This should be more appropriate...
>
> Thanks Sagi, this seems to work. I'm still hitting some other issues where in some cases reconnect fails, but I'm collecting more info.
Same type of failures?
* RE: [EXT] Re: BUG: scheduling while atomic when nvmet_rdma_queue_response fails in posting a request
2021-06-14 16:44 ` Sagi Grimberg
@ 2021-06-14 18:14 ` Michal Kalderon
0 siblings, 0 replies; 8+ messages in thread
From: Michal Kalderon @ 2021-06-14 18:14 UTC (permalink / raw)
To: Sagi Grimberg, Keith Busch
Cc: Christoph Hellwig, linux-nvme, Shai Malin, Ariel Elior
> From: Sagi Grimberg <sagi@grimberg.me>
> Sent: Monday, June 14, 2021 7:45 PM
>
>
> >> OK, it seems that the issue is that we are submitting I/O in atomic
> >> context. This should be more appropriate...
> >
> > Thanks Sagi, this seems to work. I'm still hitting some other issues where in some cases reconnect fails, but I'm collecting more info.
>
> Same type of failures?
No, something else.
After recovery completes, I'm getting the following errors on initiator side without any messages on target:
[14678.618025] nvme nvme2: Connect rejected: status -104 (reset by remote host).
[14678.619350] nvme nvme2: rdma connection establishment failed (-104)
[14678.622274] nvme nvme2: Failed reconnect attempt 6
[14678.623623] nvme nvme2: Reconnecting in 10 seconds...
[14751.304247] nvme nvme2: I/O 0 QID 0 timeout
[14751.305749] nvme nvme2: Connect command failed, error wo/DNR bit: 881
[14751.307240] nvme nvme2: failed to connect queue: 0 ret=881
[14751.310497] nvme nvme2: Failed reconnect attempt 7
[14751.312174] nvme nvme2: Reconnecting in 10 seconds...
[14825.032645] nvme nvme2: I/O 1 QID 0 timeout