* [PATCH] nvme: fix APST error for power latency tolerance @ 2021-03-23 7:31 pngliu 2021-03-23 16:23 ` Christoph Hellwig 0 siblings, 1 reply; 6+ messages in thread From: pngliu @ 2021-03-23 7:31 UTC (permalink / raw) To: kbusch; +Cc: linux-nvme, Peng Liu From: Peng Liu <liupeng17@lenovo.com> Clear apsta so that nvme_configure_apst() does not execute nvme_set_features(), which will fail because admin_q is either not set up yet or no longer available at the time of nvme_uninit_ctrl() being called, and this leads to the error message "nvme nvme0: failed to set APST feature (-19)". Fixes: 510a405d945b("nvme: fix memory leak for power latency tolerance") Signed-off-by: Peng Liu <liupeng17@lenovo.com> --- drivers/nvme/host/core.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index e68a8c4ac5a6..413a33c67655 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -4475,7 +4475,11 @@ void nvme_uninit_ctrl(struct nvme_ctrl *ctrl) { nvme_hwmon_exit(ctrl); nvme_fault_inject_fini(&ctrl->fault_inject); + + /* clear APSTA since admin_q is unavailable for feature setting */ + ctrl->apsta = 0; dev_pm_qos_hide_latency_tolerance(ctrl->device); + cdev_device_del(&ctrl->cdev, ctrl->device); nvme_put_ctrl(ctrl); } -- 2.25.1 _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] nvme: fix APST error for power latency tolerance 2021-03-23 7:31 [PATCH] nvme: fix APST error for power latency tolerance pngliu @ 2021-03-23 16:23 ` Christoph Hellwig 2021-03-24 2:38 ` Peng Liu 0 siblings, 1 reply; 6+ messages in thread From: Christoph Hellwig @ 2021-03-23 16:23 UTC (permalink / raw) To: pngliu; +Cc: kbusch, linux-nvme, Peng Liu On Tue, Mar 23, 2021 at 03:31:33PM +0800, pngliu@hotmail.com wrote: > From: Peng Liu <liupeng17@lenovo.com> > > Clear apsta so that nvme_configure_apst() does not execute > nvme_set_features(), which will fail because admin_q is either not set up > yet or no longer available at the time of nvme_uninit_ctrl() being called, > and this leads to the error message "nvme nvme0: failed to set APST feature > (-19)". > > Fixes: 510a405d945b("nvme: fix memory leak for power latency tolerance") How did you get into this situation? For PCIe nvme_uninit_ctrl is only called at the end of ->remove and ->delete_ctrl, so how do we end up in nvme_configure_apst after that? _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] nvme: fix APST error for power latency tolerance 2021-03-23 16:23 ` Christoph Hellwig @ 2021-03-24 2:38 ` Peng Liu 2021-03-24 7:58 ` Christoph Hellwig 0 siblings, 1 reply; 6+ messages in thread From: Peng Liu @ 2021-03-24 2:38 UTC (permalink / raw) To: Christoph Hellwig; +Cc: kbusch, linux-nvme, Peng Liu On Tue, Mar 23, 2021 at 04:23:21PM +0000, Christoph Hellwig wrote: > On Tue, Mar 23, 2021 at 03:31:33PM +0800, pngliu@hotmail.com wrote: > > From: Peng Liu <liupeng17@lenovo.com> > > > > Clear apsta so that nvme_configure_apst() does not execute > > nvme_set_features(), which will fail because admin_q is either not set up > > yet or no longer available at the time of nvme_uninit_ctrl() being called, > > and this leads to the error message "nvme nvme0: failed to set APST feature > > (-19)". > > > > Fixes: 510a405d945b("nvme: fix memory leak for power latency tolerance") > > How did you get into this situation? For PCIe nvme_uninit_ctrl is > only called at the end of ->remove and ->delete_ctrl, so how do we end > up in nvme_configure_apst after that? I got into it with nvme surprise and non-surprise hot-removal tests. Below is the stack ftrace result for nvme_configure_apst under the surprise hot-removal, and it is similar for the non-surprise hot-removal. irq/165-pciehp-939 [039] .... 1529.443824: nvme_configure_apst <-nvme_set_latency_tolerance irq/165-pciehp-939 [039] .... 1529.443836: <stack trace> => nvme_configure_apst => nvme_set_latency_tolerance => apply_constraint => __dev_pm_qos_remove_request => __dev_pm_qos_drop_user_request.isra.7 => dev_pm_qos_update_user_latency_tolerance => dev_pm_qos_hide_latency_tolerance => nvme_uninit_ctrl => nvme_remove => pci_device_remove => device_release_driver_internal => device_release_driver => pci_stop_bus_device => pci_stop_and_remove_bus_device => pciehp_unconfigure_device => pciehp_disable_slot => pciehp_handle_disable_request => pciehp_ist => irq_thread_fn => irq_thread => kthread => ret_from_fork _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] nvme: fix APST error for power latency tolerance 2021-03-24 2:38 ` Peng Liu @ 2021-03-24 7:58 ` Christoph Hellwig 2021-04-09 7:19 ` Christoph Hellwig 0 siblings, 1 reply; 6+ messages in thread From: Christoph Hellwig @ 2021-03-24 7:58 UTC (permalink / raw) To: Peng Liu; +Cc: Christoph Hellwig, kbusch, linux-nvme, Peng Liu On Wed, Mar 24, 2021 at 10:38:12AM +0800, Peng Liu wrote: > On Tue, Mar 23, 2021 at 04:23:21PM +0000, Christoph Hellwig wrote: > > On Tue, Mar 23, 2021 at 03:31:33PM +0800, pngliu@hotmail.com wrote: > > > From: Peng Liu <liupeng17@lenovo.com> > > > > > > Clear apsta so that nvme_configure_apst() does not execute > > > nvme_set_features(), which will fail because admin_q is either not set up > > > yet or no longer available at the time of nvme_uninit_ctrl() being called, > > > and this leads to the error message "nvme nvme0: failed to set APST feature > > > (-19)". > > > > > > Fixes: 510a405d945b("nvme: fix memory leak for power latency tolerance") > > > > How did you get into this situation? For PCIe nvme_uninit_ctrl is > > only called at the end of ->remove and ->delete_ctrl, so how do we end > > up in nvme_configure_apst after that? > > I got into it with nvme surprise and non-surprise hot-removal tests. > Below is the stack ftrace result for nvme_configure_apst under the > surprise hot-removal, and it is similar for the non-surprise hot-removal. Ok, looks like dev_pm_qos_hide_latency_tolerance calls back into nvme_set_latency_tolerance, which is a little .. unexpected. Does this patch work for you? diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c index 0896e21642beba..d5d7e0cdd78d80 100644 --- a/drivers/nvme/host/core.c +++ b/drivers/nvme/host/core.c @@ -2681,7 +2681,8 @@ static void nvme_set_latency_tolerance(struct device *dev, s32 val) if (ctrl->ps_max_latency_us != latency) { ctrl->ps_max_latency_us = latency; - nvme_configure_apst(ctrl); + if (ctrl->state == NVME_CTRL_LIVE) + nvme_configure_apst(ctrl); } } _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] nvme: fix APST error for power latency tolerance 2021-03-24 7:58 ` Christoph Hellwig @ 2021-04-09 7:19 ` Christoph Hellwig 2021-04-09 8:12 ` Peng Liu 0 siblings, 1 reply; 6+ messages in thread From: Christoph Hellwig @ 2021-04-09 7:19 UTC (permalink / raw) To: Peng Liu; +Cc: Christoph Hellwig, kbusch, linux-nvme, Peng Liu On Wed, Mar 24, 2021 at 07:58:24AM +0000, Christoph Hellwig wrote: > On Wed, Mar 24, 2021 at 10:38:12AM +0800, Peng Liu wrote: > > On Tue, Mar 23, 2021 at 04:23:21PM +0000, Christoph Hellwig wrote: > > > On Tue, Mar 23, 2021 at 03:31:33PM +0800, pngliu@hotmail.com wrote: > > > > From: Peng Liu <liupeng17@lenovo.com> > > > > > > > > Clear apsta so that nvme_configure_apst() does not execute > > > > nvme_set_features(), which will fail because admin_q is either not set up > > > > yet or no longer available at the time of nvme_uninit_ctrl() being called, > > > > and this leads to the error message "nvme nvme0: failed to set APST feature > > > > (-19)". > > > > > > > > Fixes: 510a405d945b("nvme: fix memory leak for power latency tolerance") > > > > > > How did you get into this situation? For PCIe nvme_uninit_ctrl is > > > only called at the end of ->remove and ->delete_ctrl, so how do we end > > > up in nvme_configure_apst after that? > > > > I got into it with nvme surprise and non-surprise hot-removal tests. > > Below is the stack ftrace result for nvme_configure_apst under the > > surprise hot-removal, and it is similar for the non-surprise hot-removal. > > Ok, looks like dev_pm_qos_hide_latency_tolerance calls back into > nvme_set_latency_tolerance, which is a little .. unexpected. > > Does this patch work for you? ping? > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > index 0896e21642beba..d5d7e0cdd78d80 100644 > --- a/drivers/nvme/host/core.c > +++ b/drivers/nvme/host/core.c > @@ -2681,7 +2681,8 @@ static void nvme_set_latency_tolerance(struct device *dev, s32 val) > > if (ctrl->ps_max_latency_us != latency) { > ctrl->ps_max_latency_us = latency; > - nvme_configure_apst(ctrl); > + if (ctrl->state == NVME_CTRL_LIVE) > + nvme_configure_apst(ctrl); > } > } > > > _______________________________________________ > Linux-nvme mailing list > Linux-nvme@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-nvme ---end quoted text--- _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] nvme: fix APST error for power latency tolerance 2021-04-09 7:19 ` Christoph Hellwig @ 2021-04-09 8:12 ` Peng Liu 0 siblings, 0 replies; 6+ messages in thread From: Peng Liu @ 2021-04-09 8:12 UTC (permalink / raw) To: Christoph Hellwig; +Cc: kbusch, linux-nvme, Peng Liu On Fri, Apr 09, 2021 at 08:19:21AM +0100, Christoph Hellwig wrote: > On Wed, Mar 24, 2021 at 07:58:24AM +0000, Christoph Hellwig wrote: > > On Wed, Mar 24, 2021 at 10:38:12AM +0800, Peng Liu wrote: > > > On Tue, Mar 23, 2021 at 04:23:21PM +0000, Christoph Hellwig wrote: > > > > On Tue, Mar 23, 2021 at 03:31:33PM +0800, pngliu@hotmail.com wrote: > > > > > From: Peng Liu <liupeng17@lenovo.com> > > > > > > > > > > Clear apsta so that nvme_configure_apst() does not execute > > > > > nvme_set_features(), which will fail because admin_q is either not set up > > > > > yet or no longer available at the time of nvme_uninit_ctrl() being called, > > > > > and this leads to the error message "nvme nvme0: failed to set APST feature > > > > > (-19)". > > > > > > > > > > Fixes: 510a405d945b("nvme: fix memory leak for power latency tolerance") > > > > > > > > How did you get into this situation? For PCIe nvme_uninit_ctrl is > > > > only called at the end of ->remove and ->delete_ctrl, so how do we end > > > > up in nvme_configure_apst after that? > > > > > > I got into it with nvme surprise and non-surprise hot-removal tests. > > > Below is the stack ftrace result for nvme_configure_apst under the > > > surprise hot-removal, and it is similar for the non-surprise hot-removal. > > > > Ok, looks like dev_pm_qos_hide_latency_tolerance calls back into > > nvme_set_latency_tolerance, which is a little .. unexpected. > > > > Does this patch work for you? > > ping? > Sorry for late reply. The nvme disk with APST feature is not available till now. Yes, it works. I tested it with nvme hot-removal surprise and non-surprise. > > > > diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c > > index 0896e21642beba..d5d7e0cdd78d80 100644 > > --- a/drivers/nvme/host/core.c > > +++ b/drivers/nvme/host/core.c > > @@ -2681,7 +2681,8 @@ static void nvme_set_latency_tolerance(struct device *dev, s32 val) > > > > if (ctrl->ps_max_latency_us != latency) { > > ctrl->ps_max_latency_us = latency; > > - nvme_configure_apst(ctrl); > > + if (ctrl->state == NVME_CTRL_LIVE) > > + nvme_configure_apst(ctrl); > > } > > } > > > > > > _______________________________________________ > > Linux-nvme mailing list > > Linux-nvme@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-nvme > ---end quoted text--- _______________________________________________ Linux-nvme mailing list Linux-nvme@lists.infradead.org http://lists.infradead.org/mailman/listinfo/linux-nvme ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2021-04-09 8:14 UTC | newest] Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2021-03-23 7:31 [PATCH] nvme: fix APST error for power latency tolerance pngliu 2021-03-23 16:23 ` Christoph Hellwig 2021-03-24 2:38 ` Peng Liu 2021-03-24 7:58 ` Christoph Hellwig 2021-04-09 7:19 ` Christoph Hellwig 2021-04-09 8:12 ` Peng Liu
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).