[-- Attachment #1.1: Type: text/plain, Size: 977 bytes --] After reviewing the error_recovery paths to address the abort on the io to cover timeouts on connecting ios, it became clear that some of the thing in the way it was coded were unnecessary. This patch set cleans up those items and refactors. It then addresses the failure case of io errors during reconnect which wants to take the create failure cases, but there is no status back from the core routines to trigger that path. Note: this assumes the patch "nvme_fc: track error_recovery while connecting" has been applied already V2: Extracted v1's patch4 (the track error_recovery patch) into it's own patch as it is a fix, not a cleanup. That patch was posted on its own. James Smart (3): nvme-fc: remove err_work work item nvme-fc: eliminate terminate_io use by nvme_fc_error_recovery nvme-fc: remove nvme_fc_terminate_io() drivers/nvme/host/fc.c | 258 ++++++++++++++++------------------------- 1 file changed, 101 insertions(+), 157 deletions(-) -- 2.26.2 [-- Attachment #1.2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4163 bytes --] [-- Attachment #2: Type: text/plain, Size: 158 bytes --] _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
[-- Attachment #1.1: Type: text/plain, Size: 3838 bytes --] err_work was created to handle errors (mainly io timeout) while in CONNECTING state. The flag for err_work_active is also unneeded. Remove err_work_active and err_work. The actions to abort ios are moved inline to nvme_error_recovery(). Signed-off-by: James Smart <james.smart@broadcom.com> --- drivers/nvme/host/fc.c | 40 ++++++++++------------------------------ 1 file changed, 10 insertions(+), 30 deletions(-) diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index cfb6ef74c742..9cdb116b545b 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -158,7 +158,6 @@ struct nvme_fc_ctrl { u32 cnum; bool ioq_live; - atomic_t err_work_active; u64 association_id; struct nvmefc_ls_rcv_op *rcv_disconn; @@ -168,7 +167,6 @@ struct nvme_fc_ctrl { struct blk_mq_tag_set tag_set; struct delayed_work connect_work; - struct work_struct err_work; struct kref ref; unsigned long flags; @@ -2415,11 +2413,11 @@ nvme_fc_nvme_ctrl_freed(struct nvme_ctrl *nctrl) nvme_fc_ctrl_put(ctrl); } +static void __nvme_fc_terminate_io(struct nvme_fc_ctrl *ctrl); + static void nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg) { - int active; - /* * if an error (io timeout, etc) while (re)connecting, * it's an error on creating the new association. @@ -2428,11 +2426,14 @@ nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg) * ios hitting this path before things are cleaned up. */ if (ctrl->ctrl.state == NVME_CTRL_CONNECTING) { - active = atomic_xchg(&ctrl->err_work_active, 1); - if (!active && !queue_work(nvme_fc_wq, &ctrl->err_work)) { - atomic_set(&ctrl->err_work_active, 0); - WARN_ON(1); - } + __nvme_fc_terminate_io(ctrl); + + /* + * Rescheduling the connection after recovering + * from the io error is left to the reconnect work + * item, which is what should have stalled waiting on + * the io that had the error that scheduled this work. + */ return; } @@ -3240,7 +3241,6 @@ nvme_fc_delete_ctrl(struct nvme_ctrl *nctrl) { struct nvme_fc_ctrl *ctrl = to_fc_ctrl(nctrl); - cancel_work_sync(&ctrl->err_work); cancel_delayed_work_sync(&ctrl->connect_work); /* * kill the association on the link side. this will block @@ -3351,23 +3351,6 @@ nvme_fc_reset_ctrl_work(struct work_struct *work) ctrl->cnum); } -static void -nvme_fc_connect_err_work(struct work_struct *work) -{ - struct nvme_fc_ctrl *ctrl = - container_of(work, struct nvme_fc_ctrl, err_work); - - __nvme_fc_terminate_io(ctrl); - - atomic_set(&ctrl->err_work_active, 0); - - /* - * Rescheduling the connection after recovering - * from the io error is left to the reconnect work - * item, which is what should have stalled waiting on - * the io that had the error that scheduled this work. - */ -} static const struct nvme_ctrl_ops nvme_fc_ctrl_ops = { .name = "fc", @@ -3495,7 +3478,6 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts, ctrl->dev = lport->dev; ctrl->cnum = idx; ctrl->ioq_live = false; - atomic_set(&ctrl->err_work_active, 0); init_waitqueue_head(&ctrl->ioabort_wait); get_device(ctrl->dev); @@ -3503,7 +3485,6 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts, INIT_WORK(&ctrl->ctrl.reset_work, nvme_fc_reset_ctrl_work); INIT_DELAYED_WORK(&ctrl->connect_work, nvme_fc_connect_ctrl_work); - INIT_WORK(&ctrl->err_work, nvme_fc_connect_err_work); spin_lock_init(&ctrl->lock); /* io queue count */ @@ -3596,7 +3577,6 @@ nvme_fc_init_ctrl(struct device *dev, struct nvmf_ctrl_options *opts, fail_ctrl: nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_DELETING); cancel_work_sync(&ctrl->ctrl.reset_work); - cancel_work_sync(&ctrl->err_work); cancel_delayed_work_sync(&ctrl->connect_work); ctrl->ctrl.opts = NULL; -- 2.26.2 [-- Attachment #1.2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4163 bytes --] [-- Attachment #2: Type: text/plain, Size: 158 bytes --] _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
[-- Attachment #1.1: Type: text/plain, Size: 9682 bytes --] nvme_fc_error_recovery() special cases handling when in CONNECTING state and calls __nvme_fc_terminate_io(). __nvme_fc_terminate_io() itself special cases CONNECTING state and calls the routine to abort outstanding ios. Simplify the sequence by putting the call to abort outstanding ios directly in nvme_fc_error_recovery. Move the location of __nvme_fc_abort_outstanding_ios(), and nvme_fc_terminate_exchange() which is called by it, to avoid adding function prototypes for nvme_fc_error_recovery(). Signed-off-by: James Smart <james.smart@broadcom.com> --- drivers/nvme/host/fc.c | 187 ++++++++++++++++++----------------------- 1 file changed, 84 insertions(+), 103 deletions(-) diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index 9cdb116b545b..ffbfb0533cac 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -2413,27 +2413,97 @@ nvme_fc_nvme_ctrl_freed(struct nvme_ctrl *nctrl) nvme_fc_ctrl_put(ctrl); } -static void __nvme_fc_terminate_io(struct nvme_fc_ctrl *ctrl); +/* + * This routine is used by the transport when it needs to find active + * io on a queue that is to be terminated. The transport uses + * blk_mq_tagset_busy_itr() to find the busy requests, which then invoke + * this routine to kill them on a 1 by 1 basis. + * + * As FC allocates FC exchange for each io, the transport must contact + * the LLDD to terminate the exchange, thus releasing the FC exchange. + * After terminating the exchange the LLDD will call the transport's + * normal io done path for the request, but it will have an aborted + * status. The done path will return the io request back to the block + * layer with an error status. + */ +static bool +nvme_fc_terminate_exchange(struct request *req, void *data, bool reserved) +{ + struct nvme_ctrl *nctrl = data; + struct nvme_fc_ctrl *ctrl = to_fc_ctrl(nctrl); + struct nvme_fc_fcp_op *op = blk_mq_rq_to_pdu(req); + + __nvme_fc_abort_op(ctrl, op); + return true; +} + +/* + * This routine runs through all outstanding commands on the association + * and aborts them. This routine is typically be called by the + * delete_association routine. It is also called due to an error during + * reconnect. In that scenario, it is most likely a command that initializes + * the controller, including fabric Connect commands on io queues, that + * may have timed out or failed thus the io must be killed for the connect + * thread to see the error. + */ +static void +__nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues) +{ + /* + * If io queues are present, stop them and terminate all outstanding + * ios on them. As FC allocates FC exchange for each io, the + * transport must contact the LLDD to terminate the exchange, + * thus releasing the FC exchange. We use blk_mq_tagset_busy_itr() + * to tell us what io's are busy and invoke a transport routine + * to kill them with the LLDD. After terminating the exchange + * the LLDD will call the transport's normal io done path, but it + * will have an aborted status. The done path will return the + * io requests back to the block layer as part of normal completions + * (but with error status). + */ + if (ctrl->ctrl.queue_count > 1) { + nvme_stop_queues(&ctrl->ctrl); + blk_mq_tagset_busy_iter(&ctrl->tag_set, + nvme_fc_terminate_exchange, &ctrl->ctrl); + blk_mq_tagset_wait_completed_request(&ctrl->tag_set); + if (start_queues) + nvme_start_queues(&ctrl->ctrl); + } + + /* + * Other transports, which don't have link-level contexts bound + * to sqe's, would try to gracefully shutdown the controller by + * writing the registers for shutdown and polling (call + * nvme_shutdown_ctrl()). Given a bunch of i/o was potentially + * just aborted and we will wait on those contexts, and given + * there was no indication of how live the controlelr is on the + * link, don't send more io to create more contexts for the + * shutdown. Let the controller fail via keepalive failure if + * its still present. + */ + + /* + * clean up the admin queue. Same thing as above. + */ + blk_mq_quiesce_queue(ctrl->ctrl.admin_q); + blk_mq_tagset_busy_iter(&ctrl->admin_tag_set, + nvme_fc_terminate_exchange, &ctrl->ctrl); + blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set); +} static void nvme_fc_error_recovery(struct nvme_fc_ctrl *ctrl, char *errmsg) { /* - * if an error (io timeout, etc) while (re)connecting, - * it's an error on creating the new association. - * Start the error recovery thread if it hasn't already - * been started. It is expected there could be multiple - * ios hitting this path before things are cleaned up. + * if an error (io timeout, etc) while (re)connecting, the remote + * port requested terminating of the association (disconnect_ls) + * or an error (timeout or abort) occurred on an io while creating + * the controller. Abort any ios on the association and let the + * create_association error path resolve things. */ if (ctrl->ctrl.state == NVME_CTRL_CONNECTING) { - __nvme_fc_terminate_io(ctrl); - - /* - * Rescheduling the connection after recovering - * from the io error is left to the reconnect work - * item, which is what should have stalled waiting on - * the io that had the error that scheduled this work. - */ + __nvme_fc_abort_outstanding_ios(ctrl, true); + set_bit(ASSOC_FAILED, &ctrl->flags); return; } @@ -2747,30 +2817,6 @@ nvme_fc_complete_rq(struct request *rq) nvme_fc_ctrl_put(ctrl); } -/* - * This routine is used by the transport when it needs to find active - * io on a queue that is to be terminated. The transport uses - * blk_mq_tagset_busy_itr() to find the busy requests, which then invoke - * this routine to kill them on a 1 by 1 basis. - * - * As FC allocates FC exchange for each io, the transport must contact - * the LLDD to terminate the exchange, thus releasing the FC exchange. - * After terminating the exchange the LLDD will call the transport's - * normal io done path for the request, but it will have an aborted - * status. The done path will return the io request back to the block - * layer with an error status. - */ -static bool -nvme_fc_terminate_exchange(struct request *req, void *data, bool reserved) -{ - struct nvme_ctrl *nctrl = data; - struct nvme_fc_ctrl *ctrl = to_fc_ctrl(nctrl); - struct nvme_fc_fcp_op *op = blk_mq_rq_to_pdu(req); - - __nvme_fc_abort_op(ctrl, op); - return true; -} - static const struct blk_mq_ops nvme_fc_mq_ops = { .queue_rq = nvme_fc_queue_rq, @@ -3111,60 +3157,6 @@ nvme_fc_create_association(struct nvme_fc_ctrl *ctrl) } -/* - * This routine runs through all outstanding commands on the association - * and aborts them. This routine is typically be called by the - * delete_association routine. It is also called due to an error during - * reconnect. In that scenario, it is most likely a command that initializes - * the controller, including fabric Connect commands on io queues, that - * may have timed out or failed thus the io must be killed for the connect - * thread to see the error. - */ -static void -__nvme_fc_abort_outstanding_ios(struct nvme_fc_ctrl *ctrl, bool start_queues) -{ - /* - * If io queues are present, stop them and terminate all outstanding - * ios on them. As FC allocates FC exchange for each io, the - * transport must contact the LLDD to terminate the exchange, - * thus releasing the FC exchange. We use blk_mq_tagset_busy_itr() - * to tell us what io's are busy and invoke a transport routine - * to kill them with the LLDD. After terminating the exchange - * the LLDD will call the transport's normal io done path, but it - * will have an aborted status. The done path will return the - * io requests back to the block layer as part of normal completions - * (but with error status). - */ - if (ctrl->ctrl.queue_count > 1) { - nvme_stop_queues(&ctrl->ctrl); - blk_mq_tagset_busy_iter(&ctrl->tag_set, - nvme_fc_terminate_exchange, &ctrl->ctrl); - blk_mq_tagset_wait_completed_request(&ctrl->tag_set); - if (start_queues) - nvme_start_queues(&ctrl->ctrl); - } - - /* - * Other transports, which don't have link-level contexts bound - * to sqe's, would try to gracefully shutdown the controller by - * writing the registers for shutdown and polling (call - * nvme_shutdown_ctrl()). Given a bunch of i/o was potentially - * just aborted and we will wait on those contexts, and given - * there was no indication of how live the controlelr is on the - * link, don't send more io to create more contexts for the - * shutdown. Let the controller fail via keepalive failure if - * its still present. - */ - - /* - * clean up the admin queue. Same thing as above. - */ - blk_mq_quiesce_queue(ctrl->ctrl.admin_q); - blk_mq_tagset_busy_iter(&ctrl->admin_tag_set, - nvme_fc_terminate_exchange, &ctrl->ctrl); - blk_mq_tagset_wait_completed_request(&ctrl->admin_tag_set); -} - /* * This routine stops operation of the controller on the host side. * On the host os stack side: Admin and IO queues are stopped, @@ -3297,17 +3289,6 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status) static void __nvme_fc_terminate_io(struct nvme_fc_ctrl *ctrl) { - /* - * if state is CONNECTING - the error occurred as part of a - * reconnect attempt. Abort any ios on the association and - * let the create_association error paths resolve things. - */ - if (ctrl->ctrl.state == NVME_CTRL_CONNECTING) { - __nvme_fc_abort_outstanding_ios(ctrl, true); - set_bit(ASSOC_FAILED, &ctrl->flags); - return; - } - /* * For any other state, kill the association. As this routine * is a common io abort routine for resetting and such, after -- 2.26.2 [-- Attachment #1.2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4163 bytes --] [-- Attachment #2: Type: text/plain, Size: 158 bytes --] _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
[-- Attachment #1.1: Type: text/plain, Size: 2665 bytes --] __nvme_fc_terminate_io() is now called by only 1 place, in reset_work. Consoldate and move the functionality of terminate_io into reset_work. In reset_work, rather than calling the create_association directly, schedule the connect work element to do its thing. After scheduling, flush the connect work element to continue with semantic of not returning until connect has been attempted at least once. Signed-off-by: James Smart <james.smart@broadcom.com> --- drivers/nvme/host/fc.c | 49 ++++++++++++++---------------------------- 1 file changed, 16 insertions(+), 33 deletions(-) diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c index ffbfb0533cac..f4c246462658 100644 --- a/drivers/nvme/host/fc.c +++ b/drivers/nvme/host/fc.c @@ -3287,49 +3287,32 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status) } static void -__nvme_fc_terminate_io(struct nvme_fc_ctrl *ctrl) +nvme_fc_reset_ctrl_work(struct work_struct *work) { - /* - * For any other state, kill the association. As this routine - * is a common io abort routine for resetting and such, after - * the association is terminated, ensure that the state is set - * to CONNECTING. - */ + struct nvme_fc_ctrl *ctrl = + container_of(work, struct nvme_fc_ctrl, ctrl.reset_work); - nvme_stop_keep_alive(&ctrl->ctrl); + nvme_stop_ctrl(&ctrl->ctrl); /* will block will waiting for io to terminate */ nvme_fc_delete_association(ctrl); - if (ctrl->ctrl.state != NVME_CTRL_CONNECTING && - !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) + if (!nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING)) dev_err(ctrl->ctrl.device, "NVME-FC{%d}: error_recovery: Couldn't change state " "to CONNECTING\n", ctrl->cnum); -} - -static void -nvme_fc_reset_ctrl_work(struct work_struct *work) -{ - struct nvme_fc_ctrl *ctrl = - container_of(work, struct nvme_fc_ctrl, ctrl.reset_work); - int ret; - - __nvme_fc_terminate_io(ctrl); - nvme_stop_ctrl(&ctrl->ctrl); - - if (ctrl->rport->remoteport.port_state == FC_OBJSTATE_ONLINE) - ret = nvme_fc_create_association(ctrl); - else - ret = -ENOTCONN; - - if (ret) - nvme_fc_reconnect_or_delete(ctrl, ret); - else - dev_info(ctrl->ctrl.device, - "NVME-FC{%d}: controller reset complete\n", - ctrl->cnum); + if (ctrl->rport->remoteport.port_state == FC_OBJSTATE_ONLINE) { + if (!queue_delayed_work(nvme_wq, &ctrl->connect_work, 0)) { + dev_err(ctrl->ctrl.device, + "NVME-FC{%d}: failed to schedule connect " + "after reset\n", ctrl->cnum); + } else { + flush_delayed_work(&ctrl->connect_work); + } + } else { + nvme_fc_reconnect_or_delete(ctrl, -ENOTCONN); + } } -- 2.26.2 [-- Attachment #1.2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4163 bytes --] [-- Attachment #2: Type: text/plain, Size: 158 bytes --] _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
Thanks, applied to nvme-5.10. _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
Hi James, On Fri, Oct 23, 2020 at 03:27:51PM -0700, James Smart wrote: > nvme_fc_error_recovery() special cases handling when in CONNECTING state > and calls __nvme_fc_terminate_io(). __nvme_fc_terminate_io() itself > special cases CONNECTING state and calls the routine to abort outstanding > ios. > > Simplify the sequence by putting the call to abort outstanding ios directly > in nvme_fc_error_recovery. > > Move the location of __nvme_fc_abort_outstanding_ios(), and > nvme_fc_terminate_exchange() which is called by it, to avoid adding > function prototypes for nvme_fc_error_recovery(). During local testing I run into this problem: BUG: scheduling while atomic: swapper/37/0/0x00000100 Modules linked in: iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) intel_rapl_msr(E) intel_rapl_common(E) sb_edac(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) ext4(E) nls_iso8859_1(E) coretemp(E) nls_cp437(E) crc16(E) kvm_intel(E) mbcache(E) jbd2(E) kvm(E) vfat(E) irqbypass(E) crc32_pclmul(E) fat(E) ghash_clmulni_intel(E) iTCO_wdt(E) lpfc(E) iTCO_vendor_support(E) aesni_intel(E) nvmet_fc(E) aes_x86_64(E) ipmi_ssif(E) crypto_simd(E) nvmet(E) bnx2x(E) cryptd(E) glue_helper(E) pcspkr(E) lpc_ich(E) ipmi_si(E) tg3(E) mdio(E) ioatdma(E) hpilo(E) mfd_core(E) hpwdt(E) ipmi_devintf(E) configfs(E) libphy(E) dca(E) ipmi_msghandler(E) button(E) btrfs(E) libcrc32c(E) xor(E) raid6_pq(E) mgag200(E) drm_vram_helper(E) sd_mod(E) ttm(E) i2c_algo_bit(E) qla2xxx(E) drm_kms_helper(E) syscopyarea(E) nvme_fc(E) sysfillrect(E) sysimgblt(E) nvme_fabrics(E) uhci_hcd(E) fb_sys_fops(E) ehci_pci(E) ehci_hcd(E) nvme_core(E) crc32c_intel(E) scsi_transport_fc(E) drm(E) usbcore(E) hpsa(E) scsi_transport_sas(E) wmi(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) efivarfs(E) Supported: No, Unreleased kernel CPU: 37 PID: 0 Comm: swapper/37 Tainted: G EL 5.3.18-0.g7362c5c-default #1 SLE15-SP2 (unreleased) Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 10/21/2019 Call Trace: <IRQ> dump_stack+0x66/0x8b __schedule_bug+0x51/0x70 __schedule+0x697/0x750 schedule+0x2f/0xa0 schedule_timeout+0x1dd/0x300 ? lpfc_sli4_fp_handle_fcp_wcqe.isra.31+0x146/0x390 [lpfc] ? update_group_capacity+0x25/0x1b0 wait_for_completion+0xba/0x140 ? wake_up_q+0xa0/0xa0 __wait_rcu_gp+0x110/0x130 synchronize_rcu+0x55/0x80 ? __call_rcu+0x4e0/0x4e0 ? __bpf_trace_rcu_invoke_callback+0x10/0x10 __nvme_fc_abort_outstanding_ios+0x5f/0x90 [nvme_fc] nvme_fc_error_recovery+0x25/0x70 [nvme_fc] nvme_fc_fcpio_done+0x243/0x400 [nvme_fc] lpfc_sli4_nvme_xri_aborted+0x62/0x100 [lpfc] lpfc_sli4_sp_handle_abort_xri_wcqe.isra.56+0x4c/0x170 [lpfc] ? lpfc_sli4_fp_handle_cqe+0x8b/0x490 [lpfc] lpfc_sli4_fp_handle_cqe+0x8b/0x490 [lpfc] __lpfc_sli4_process_cq+0xfd/0x270 [lpfc] ? lpfc_sli4_sp_handle_abort_xri_wcqe.isra.56+0x170/0x170 [lpfc] __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc] lpfc_cq_poll_hdler+0x16/0x20 [lpfc] irq_poll_softirq+0x88/0x110 __do_softirq+0xe3/0x2dc irq_exit+0xd5/0xe0 do_IRQ+0x7f/0xd0 common_interrupt+0xf/0xf </IRQ> I think we can't move the __nvme_fc_abort_outstanding_ios() into this path as we are still running in IRQ context. Thanks, Daniel _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme
[-- Attachment #1.1: Type: text/plain, Size: 3740 bytes --] On 11/19/2020 2:51 AM, Daniel Wagner wrote: > Hi James, > > On Fri, Oct 23, 2020 at 03:27:51PM -0700, James Smart wrote: >> nvme_fc_error_recovery() special cases handling when in CONNECTING state >> and calls __nvme_fc_terminate_io(). __nvme_fc_terminate_io() itself >> special cases CONNECTING state and calls the routine to abort outstanding >> ios. >> >> Simplify the sequence by putting the call to abort outstanding ios directly >> in nvme_fc_error_recovery. >> >> Move the location of __nvme_fc_abort_outstanding_ios(), and >> nvme_fc_terminate_exchange() which is called by it, to avoid adding >> function prototypes for nvme_fc_error_recovery(). > During local testing I run into this problem: > > BUG: scheduling while atomic: swapper/37/0/0x00000100 > Modules linked in: iscsi_ibft(E) iscsi_boot_sysfs(E) rfkill(E) intel_rapl_msr(E) intel_rapl_common(E) sb_edac(E) x86_pkg_temp_thermal(E) intel_powerclamp(E) ext4(E) nls_iso8859_1(E) coretemp(E) nls_cp437(E) crc16(E) kvm_intel(E) mbcache(E) jbd2(E) kvm(E) vfat(E) irqbypass(E) crc32_pclmul(E) fat(E) ghash_clmulni_intel(E) iTCO_wdt(E) lpfc(E) iTCO_vendor_support(E) aesni_intel(E) nvmet_fc(E) aes_x86_64(E) ipmi_ssif(E) crypto_simd(E) nvmet(E) bnx2x(E) cryptd(E) glue_helper(E) pcspkr(E) lpc_ich(E) ipmi_si(E) tg3(E) mdio(E) ioatdma(E) hpilo(E) mfd_core(E) hpwdt(E) ipmi_devintf(E) configfs(E) libphy(E) dca(E) ipmi_msghandler(E) button(E) btrfs(E) libcrc32c(E) xor(E) raid6_pq(E) mgag200(E) drm_vram_helper(E) sd_mod(E) ttm(E) i2c_algo_bit(E) qla2xxx(E) drm_kms_helper(E) syscopyarea(E) nvme_fc(E) sysfillrect(E) sysimgblt(E) nvme_fabrics(E) uhci_hcd(E) fb_sys_fops(E) ehci_pci(E) ehci_hcd(E) nvme_core(E) crc32c_intel(E) scsi_transport_fc(E) drm(E) usbcore(E) hpsa(E) scsi_transport_sas(E) > wmi(E) sg(E) dm_multipath(E) dm_mod(E) scsi_dh_rdac(E) scsi_dh_emc(E) scsi_dh_alua(E) scsi_mod(E) efivarfs(E) > Supported: No, Unreleased kernel > CPU: 37 PID: 0 Comm: swapper/37 Tainted: G EL 5.3.18-0.g7362c5c-default #1 SLE15-SP2 (unreleased) > Hardware name: HP ProLiant DL580 Gen9/ProLiant DL580 Gen9, BIOS U17 10/21/2019 > Call Trace: > <IRQ> > dump_stack+0x66/0x8b > __schedule_bug+0x51/0x70 > __schedule+0x697/0x750 > schedule+0x2f/0xa0 > schedule_timeout+0x1dd/0x300 > ? lpfc_sli4_fp_handle_fcp_wcqe.isra.31+0x146/0x390 [lpfc] > ? update_group_capacity+0x25/0x1b0 > wait_for_completion+0xba/0x140 > ? wake_up_q+0xa0/0xa0 > __wait_rcu_gp+0x110/0x130 > synchronize_rcu+0x55/0x80 > ? __call_rcu+0x4e0/0x4e0 > ? __bpf_trace_rcu_invoke_callback+0x10/0x10 > __nvme_fc_abort_outstanding_ios+0x5f/0x90 [nvme_fc] > nvme_fc_error_recovery+0x25/0x70 [nvme_fc] > nvme_fc_fcpio_done+0x243/0x400 [nvme_fc] > lpfc_sli4_nvme_xri_aborted+0x62/0x100 [lpfc] > lpfc_sli4_sp_handle_abort_xri_wcqe.isra.56+0x4c/0x170 [lpfc] > ? lpfc_sli4_fp_handle_cqe+0x8b/0x490 [lpfc] > lpfc_sli4_fp_handle_cqe+0x8b/0x490 [lpfc] > __lpfc_sli4_process_cq+0xfd/0x270 [lpfc] > ? lpfc_sli4_sp_handle_abort_xri_wcqe.isra.56+0x170/0x170 [lpfc] > __lpfc_sli4_hba_process_cq+0x3c/0x110 [lpfc] > lpfc_cq_poll_hdler+0x16/0x20 [lpfc] > irq_poll_softirq+0x88/0x110 > __do_softirq+0xe3/0x2dc > irq_exit+0xd5/0xe0 > do_IRQ+0x7f/0xd0 > common_interrupt+0xf/0xf > </IRQ> > > > I think we can't move the __nvme_fc_abort_outstanding_ios() into this > path as we are still running in IRQ context. > > Thanks, > Daniel > Daniel, I agree with you. This was brought about by lpfc converting to use to blk_irq poll. I'll put something together for the transport, as it is likely reasonable to expect use of blk_irq -- james [-- Attachment #1.2: S/MIME Cryptographic Signature --] [-- Type: application/pkcs7-signature, Size: 4163 bytes --] [-- Attachment #2: Type: text/plain, Size: 158 bytes --] _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme