* kernel panic due to a missing work initialization in case of zero kato value
From: Engel, Amit @ 2021-04-20 18:36 UTC
  To: sagi, linux-nvme; +Cc: Engel, Amit

Hello,

We hit a kernel panic as a result of the sequence below.

In the current nvmet implementation, 'nvmet_start_keep_alive_timer' initializes the nvmet_keep_alive_timer work only if kato != 0.

When an nvme connect cmd is executed with a zero kato value, 'INIT_DELAYED_WORK(&ctrl->ka_work, nvmet_keep_alive_timer)' is not called.

Once a keep alive cmd arrives, we call 'mod_delayed_work' for a work that has not been initialized. This leads to a kernel WARNING:

Apr 20 10:32:59 FNM00190700796-A kernel: WARNING: CPU: 11 PID: 75133 at kernel/workqueue.c:1447 __queue_work.cold.55+0xc/0x3c

and eventually to a soft lockup.

A simple fix for this issue (I will post a patch soon) is to initialize the work (as part of 'nvmet_start_keep_alive_timer') even if kato == 0.

Thanks
Amit E

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
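The sequence reported above can be replayed in a small userspace model. This is only a sketch under stated assumptions: `model_work`, `model_ctrl` and every `model_*` function are hypothetical stand-ins invented for illustration, not the kernel's `struct delayed_work`, the nvmet controller, or the real nvmet functions. The model returns an error where the real kernel would hit the WARNING in `__queue_work`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-in for struct delayed_work; not the kernel type. */
struct model_work {
	bool initialized;	/* set by model_init_work(), playing the role of INIT_DELAYED_WORK */
	bool queued;		/* true once the work has been (re-)armed */
};

/* Hypothetical stand-in for the nvmet controller. */
struct model_ctrl {
	unsigned int kato;	/* keep-alive timeout; 0 means keep-alive is disabled */
	struct model_work ka_work;
};

static void model_init_work(struct model_work *w)
{
	w->initialized = true;
	w->queued = false;
}

/* Models the behaviour described in the report: init only when kato != 0. */
static void model_start_keep_alive_timer(struct model_ctrl *ctrl)
{
	if (ctrl->kato == 0)
		return;		/* work is left uninitialized */
	model_init_work(&ctrl->ka_work);
	ctrl->ka_work.queued = true;
}

/*
 * Models the keep-alive command path re-arming the work unconditionally.
 * Returns 0 on success, -1 if the work was never initialized (the case
 * that corresponds to the WARNING in the report).
 */
static int model_keep_alive_cmd(struct model_ctrl *ctrl)
{
	if (!ctrl->ka_work.initialized) {
		fprintf(stderr, "model: re-arming uninitialized work\n");
		return -1;
	}
	ctrl->ka_work.queued = true;
	return 0;
}

/* Replays "connect, then keep-alive" for a given kato value. */
static int model_replay(unsigned int kato)
{
	struct model_ctrl ctrl = { .kato = kato };

	model_start_keep_alive_timer(&ctrl);
	return model_keep_alive_cmd(&ctrl);
}
```

Calling `model_replay(0)` takes the failing path, while any non-zero kato initializes the work first and succeeds.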
* Re: kernel panic due to a missing work initialization in case of zero kato value
From: Chaitanya Kulkarni @ 2021-04-20 22:24 UTC
  To: Engel, Amit, sagi, linux-nvme

On 4/20/21 11:46, Engel, Amit wrote:
> A simple fix for this issue (I will post a patch soon) is to initialize the work (as part of 'nvmet_start_keep_alive_timer') even if kato == 0
>
> Thanks
> Amit

This may need a fixes tag.
* kernel panic due to a missing work initialization in case of zero
From: Hou Pu @ 2021-04-21 2:32 UTC
  To: amit.engel; +Cc: linux-nvme, sagi

On 4/20/21 11:46, Engel, Amit wrote:
> Hello,
>
> We hit a kernel panic as a result of the below sequence:
> In the current nvmet implementation, as part of 'nvmet_start_keep_alive_timer'
> nvmet_keep_alive_timer work will be initialized only if kato != 0
>
> when nvme connect cmd is being executed with a zero kato value
> 'INIT_DELAYED_WORK(&ctrl->ka_work, nvmet_keep_alive_timer)' will not be called
>
> once keep alive cmd arrives, we call 'mod_delayed_work' for a work that has not been initialized
> this will lead to kernel WARNING:
> Apr 20 10:32:59 FNM00190700796-A kernel: WARNING: CPU: 11 PID: 75133 at kernel/workqueue.c:1447 __queue_work.cold.55+0xc/0x3c
> And eventually to soft lockup

Hello Engel,

Could you verify this with latest nvme-5.13 branch?
I think this might be the same problem as commit 7b96918a173
(nvmet: avoid queuing keep-alive timer if it is disabled) fixed.

Thanks,
Hou

>
> A simple fix for this issue (I will post a patch soon) is to initialize the work (as part of 'nvmet_start_keep_alive_timer') even if kato == 0
>
> Thanks
> Amit E
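Going only by the commit title quoted above ("avoid queuing keep-alive timer if it is disabled"), the idea can be sketched as a guard in the keep-alive command path. This is a userspace illustration, not the contents of commit 7b96918a173: `guard_ctrl`, `guard_status` and `guard_keep_alive_cmd` are hypothetical names, and the status values are placeholders rather than NVMe status codes.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder result codes; not NVMe status values. */
enum guard_status {
	GUARD_OK = 0,
	GUARD_KA_DISABLED = 1,
};

/* Hypothetical stand-in for the controller state relevant to keep-alive. */
struct guard_ctrl {
	unsigned int kato;	/* 0 means keep-alive is disabled */
	bool work_initialized;	/* whether the timer work was ever set up */
	bool work_queued;	/* whether the timer work is currently armed */
};

/*
 * Keep-alive command handler with the guard: when keep-alive is disabled
 * (kato == 0) the timer work is never touched, so an uninitialized work
 * item cannot be queued from this path.
 */
static enum guard_status guard_keep_alive_cmd(struct guard_ctrl *ctrl)
{
	if (ctrl->kato == 0)
		return GUARD_KA_DISABLED;

	ctrl->work_queued = true;	/* models re-arming the timer */
	return GUARD_OK;
}
```

With the guard in place, a controller connected with kato == 0 rejects the keep-alive command instead of re-arming a work item that was never initialized.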
* RE: kernel panic due to a missing work initialization in case of zero
From: Engel, Amit @ 2021-04-21 13:29 UTC
  To: Hou Pu; +Cc: linux-nvme, sagi

Hi Hou,

Yes, commit 7b96918a173 (nvmet: avoid queuing keep-alive timer if it is disabled) fixes the panic we hit.

One comment: it might be more elegant to move

    INIT_DELAYED_WORK(&ctrl->ka_work, nvmet_keep_alive_timer);

from nvmet_start_keep_alive_timer to nvmet_alloc_ctrl.

This way, we will not INIT ka_work each time the keep alive timer is started (each nvmet_set_feat_kato, for example, calls nvmet_start_keep_alive_timer).
IMO it makes more sense to INIT_DELAYED_WORK only once (as part of nvmet_alloc_ctrl).

Let me know what you think and if you want me to provide this minor change.

Thanks
Amit
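The proposal above (initialize the work once at controller allocation instead of on every timer start) can be sketched in userspace as follows. All `sketch_*` names and both structs are hypothetical stand-ins for illustration; this is not the nvmet code or an actual patch. The `init_count` field exists only so the model can show that initialization happens exactly once.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct delayed_work. */
struct sketch_work {
	bool initialized;
	bool queued;
	unsigned int init_count;	/* how many times the work was initialized */
};

/* Hypothetical stand-in for the nvmet controller. */
struct sketch_ctrl {
	unsigned int kato;		/* 0 means keep-alive is disabled */
	struct sketch_work ka_work;
};

/* One-time initialization, playing the role of INIT_DELAYED_WORK. */
static void sketch_init_work(struct sketch_work *w)
{
	w->initialized = true;
	w->queued = false;
	w->init_count++;
}

/* Allocation is the single place where the work gets initialized. */
static struct sketch_ctrl *sketch_alloc_ctrl(unsigned int kato)
{
	struct sketch_ctrl *ctrl = calloc(1, sizeof(*ctrl));

	if (!ctrl)
		return NULL;
	ctrl->kato = kato;
	sketch_init_work(&ctrl->ka_work);
	return ctrl;
}

/* Starting the timer only arms the work; it no longer initializes it. */
static void sketch_start_keep_alive_timer(struct sketch_ctrl *ctrl)
{
	if (ctrl->kato == 0)
		return;
	ctrl->ka_work.queued = true;
}

/* Models a Set Features (KATO) update: stop, change kato, restart. */
static void sketch_set_feat_kato(struct sketch_ctrl *ctrl, unsigned int kato)
{
	ctrl->ka_work.queued = false;
	ctrl->kato = kato;
	sketch_start_keep_alive_timer(ctrl);
}
```

In this model the work is valid from allocation onward even when the controller starts with kato == 0, and repeated kato updates re-arm the timer without re-initializing it.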
* Re: kernel panic due to a missing work initialization in case of zero
From: Hou Pu @ 2021-04-22 12:16 UTC
  To: Engel, Amit; +Cc: linux-nvme, sagi

On Wed, Apr 21, 2021 at 9:29 PM Engel, Amit <Amit.Engel@dell.com> wrote:
>
> Hi Hou,
> Yes, commit 7b96918a173 (nvmet: avoid queuing keep-alive timer if it is disabled) fixes the panic we hit.

Thanks.

>
> One comment:
> It might be more elegant to move
> INIT_DELAYED_WORK(&ctrl->ka_work, nvmet_keep_alive_timer);
> From nvmet_start_keep_alive_timer To nvmet_alloc_ctrl
> This way, we will not INIT ka_work each time the keep alive timer is started
> (each nvmet_set_feat_kato for example, will start_keep_alive_timer)
> IMO it make more sense to INIT_DELAYED_WORK only once (as part of alloc_ctrl)
>
> Let me know what you think and if you want me to provide this minor change
>

Yes, this makes more sense AFAIK. I'm OK with it.

Thanks,
Hou