From: osmithde@cisco.com (Oliver Smith-Denny)
Date: Thu, 21 Feb 2019 10:45:54 -0800
Subject: [PATCH] nvmet-fc: Bring Disconnect into compliance with FC-NVME spec
In-Reply-To: <2fc8ae0b-2773-87e1-a319-55e251b3f7d7@gmail.com>
References: <20190205173902.17947-1-jsmart2021@gmail.com>
 <20190220221454.GA31450@osmithde-lnx.cisco.com>
 <2fc8ae0b-2773-87e1-a319-55e251b3f7d7@gmail.com>
Message-ID: <2cd0c5e9-845a-2122-e2a1-7ef3f96ce33f@cisco.com>

On 02/21/2019 10:29 AM, James Smart wrote:
[snip]
> I plan to make another pass through the transport for spec compliance in
> a couple of weeks. Part of that was ensuring all the headers and
> disconnect behaviors were in sync. We should also get some of the SLER
> bits in. I can roll your changes in with that or you're free to make the
> suggested changes called out above.

Sounds good to roll these header changes in with the rest of the spec
compliance work you are planning. I agree with your other comments as
well; they make sense.

I have been testing with these changes and have been hitting one warning
(kernel/workqueue.c:3028) when the discovery controller receives an
NVMe_Disconnect. I have also been trying some error injection (not
sending the occasional response from the target LLDD for write data) and
am seeing tasks blocked for more than 120 seconds, with the following
call trace (this is after the data controller receives an
NVMe_Disconnect):

INFO: task kworker/27:2:35310 blocked for more than 120 seconds.
      Tainted: G W O 5.0.0-rc7-next-20190220+ #1
kworker/27:2 D 0 35310 2 0x80000080
Workqueue: events nvmet_fc_handle_ls_rqst_work [nvmet_fc]
Call Trace:
 __schedule+0x2ab/0x880
 ? complete+0x4d/0x60
 schedule+0x36/0x70
 schedule_timeout+0x1dc/0x300
 complete+0x4d/0x60
 nvmet_destroy_namespace+0x20/0x20 [nvmet]
 wait_for_completion+0x121/0x180
 wake_up_q+0x80/0x80
 nvmet_sq_destroy+0x4f/0xf0 [nvmet]
 nvmet_fc_delete_target_assoc+0x2fd/0x3f0 [nvmet_fc]
 nvmet_fc_handle_ls_rqst_work+0x6ad/0xa40 [nvmet_fc]
 process_one_work+0x179/0x3a0
 worker_thread+0x4f/0x3e0
 kthread+0x105/0x140
 ? max_active_store+0x80/0x80
 ? kthread_bind+0x20/0x20
 ret_from_fork+0x35/0x40

So I will investigate this, first making sure it is not caused by
something I am doing incorrectly in the LLDD. If it is not, I will
follow up with fuller results. I only see this after incorporating your
changes and mine (though I will start by taking my changes out, since
not all of them should be there).

Thanks,
Oliver
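
P.S. For reference, the completion wait the trace is stuck in appears to
be the one in nvmet_sq_destroy(). Abridged from drivers/nvme/target/core.c
as I read it (details may differ on -next), with my working theory in the
comments:

void nvmet_sq_destroy(struct nvmet_sq *sq)
{
	...
	/*
	 * Every outstanding request holds a reference on sq->ref, and
	 * sq->free_done is only completed once the last reference is
	 * dropped. If the LLDD never completes a write-data op, that
	 * request never puts its reference, so the second wait below
	 * blocks forever, which would match the hang above.
	 */
	percpu_ref_kill_and_confirm(&sq->ref, nvmet_confirm_sq);
	wait_for_completion(&sq->confirm_done);
	wait_for_completion(&sq->free_done);
	percpu_ref_exit(&sq->ref);
	...
}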