From: "jianchao.wang" <jianchao.w.wang@oracle.com>
To: Keith Busch <keith.busch@intel.com>
Cc: axboe@fb.com, hch@lst.de, sagi@grimberg.me, maxg@mellanox.com,
	james.smart@broadcom.com, linux-nvme@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing
Date: Fri, 19 Jan 2018 17:02:06 +0800	[thread overview]
Message-ID: <84b4e3bc-fe23-607e-9d5e-bb5644eedb54@oracle.com> (raw)
In-Reply-To: <20180119084218.GF12043@localhost.localdomain>

Hi Keith

Thank you for your kind and detailed response, and for the patch.

On 01/19/2018 04:42 PM, Keith Busch wrote:
> On Fri, Jan 19, 2018 at 04:14:02PM +0800, jianchao.wang wrote:
>> On 01/19/2018 04:01 PM, Keith Busch wrote:
>>> The nvme_dev_disable routine makes forward progress without depending on
>>> timeout handling to complete expired commands. Once controller disabling
>>> completes, there can't possibly be any started requests that can expire.
>>> So we don't need nvme_timeout to do anything for requests above the
>>> boundary.
>>>
>> Yes, once controller disabling completes, any started requests will be handled and cannot expire.
>> But before that _boundary_, a nvme_timeout context could still be running in parallel with nvme_dev_disable.
>> If the timeout path grabs a request first, nvme_dev_disable cannot reach it to cancel it.
>> So even after nvme_dev_disable completes, there could still be a request held in the nvme_timeout context.
>>
>> The worst case is :
>> nvme_timeout                              nvme_reset_work
>> if (ctrl->state == RESETTING )              nvme_dev_disable
>>     nvme_dev_disable                        initializing procedure
>>
>> Here nvme_dev_disable runs in parallel with the re-init procedure in nvme_reset_work.
> 
> Okay, I see what you're saying. That's a pretty obscure case, as normally
> we enter nvme_reset_work with the queues already disabled, but there
> are a few cases where we need nvme_reset_work to handle that.
> 
> But if we are in that case, I think we really just want to sync the
> queues. What do you think of this?
> 
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index fde6fd2e7eef..516383193416 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -3520,6 +3520,17 @@ void nvme_stop_queues(struct nvme_ctrl *ctrl)
>  }
>  EXPORT_SYMBOL_GPL(nvme_stop_queues);
>  
> +void nvme_sync_queues(struct nvme_ctrl *ctrl)
> +{
> +	struct nvme_ns *ns;
> +
> +	mutex_lock(&ctrl->namespaces_mutex);
> +	list_for_each_entry(ns, &ctrl->namespaces, list)
> +		blk_sync_queue(ns->queue);
> +	mutex_unlock(&ctrl->namespaces_mutex);
> +}
> +EXPORT_SYMBOL_GPL(nvme_sync_queues);
> +
>  void nvme_start_queues(struct nvme_ctrl *ctrl)
>  {
>  	struct nvme_ns *ns;
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index 8e7fc1b041b7..45b1b8ceddb6 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -374,6 +374,7 @@ void nvme_complete_async_event(struct nvme_ctrl *ctrl, __le16 status,
>  
>  void nvme_stop_queues(struct nvme_ctrl *ctrl);
>  void nvme_start_queues(struct nvme_ctrl *ctrl);
> +void nvme_sync_queues(struct nvme_ctrl *ctrl);
>  void nvme_kill_queues(struct nvme_ctrl *ctrl);
>  void nvme_unfreeze(struct nvme_ctrl *ctrl);
>  void nvme_wait_freeze(struct nvme_ctrl *ctrl);
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index a2ffb557b616..1fe00be22ad1 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2289,8 +2289,10 @@ static void nvme_reset_work(struct work_struct *work)
>  	 * If we're called to reset a live controller first shut it down before
>  	 * moving on.
>  	 */
> -	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE)
> +	if (dev->ctrl.ctrl_config & NVME_CC_ENABLE) {
>  		nvme_dev_disable(dev, false);
> +		nvme_sync_queues(&dev->ctrl);
> +	}
>  
>  	result = nvme_pci_enable(dev);
>  	if (result)
> --
> 

We should not use blk_sync_queue here, because it would also cancel the requeue_work and run_work.
Just flush_work(&q->timeout_work) should be enough.
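Something along these lines is what I have in mind -- just a rough, untested sketch that reuses the namespaces loop from your patch:

void nvme_sync_queues(struct nvme_ctrl *ctrl)
{
	struct nvme_ns *ns;

	mutex_lock(&ctrl->namespaces_mutex);
	/*
	 * Wait for any nvme_timeout() already running on these queues
	 * to finish; unlike blk_sync_queue(), this does not cancel
	 * requeue_work or run_work.
	 */
	list_for_each_entry(ns, &ctrl->namespaces, list)
		flush_work(&ns->queue->timeout_work);
	mutex_unlock(&ctrl->namespaces_mutex);
}
EXPORT_SYMBOL_GPL(nvme_sync_queues);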

In addition, we could check NVME_CC_ENABLE inside nvme_dev_disable itself to avoid redundant invocations. :)
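
For that second point, I am imagining something like the following -- purely illustrative, not a tested patch, and we would still need to confirm that an early return is safe for the other callers of nvme_dev_disable:

static void nvme_dev_disable(struct nvme_dev *dev, bool shutdown)
{
	/*
	 * Skip the teardown entirely if the controller was never
	 * enabled, so callers such as nvme_reset_work need not repeat
	 * the NVME_CC_ENABLE test themselves.
	 */
	if (!(dev->ctrl.ctrl_config & NVME_CC_ENABLE))
		return;

	/* ... existing shutdown and request-cancelling logic ... */
}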

Thanks
Jianchao

Thread overview: 40+ messages
2018-01-18 10:10 [PATCH V5 0/2] nvme-pci: fix the timeout case when reset is ongoing Jianchao Wang
2018-01-18 10:10 ` [PATCH V5 1/2] nvme-pci: introduce RECONNECTING state to mark initializing procedure Jianchao Wang
2018-01-18 10:17   ` Max Gurtovoy
2018-01-19  9:49     ` jianchao.wang
2018-01-18 15:23   ` James Smart
2018-01-18 10:10 ` [PATCH V5 2/2] nvme-pci: fixup the timeout case when reset is ongoing Jianchao Wang
2018-01-19  4:59   ` Keith Busch
2018-01-19  5:55     ` jianchao.wang
2018-01-19  6:05       ` Keith Busch
2018-01-19  6:53         ` jianchao.wang
2018-01-18 15:34 ` [PATCH V5 0/2] nvme-pci: fix " James Smart
2018-01-19  8:01 ` Keith Busch
2018-01-19  8:14   ` jianchao.wang
2018-01-19  8:42     ` Keith Busch
2018-01-19  9:02       ` jianchao.wang [this message]
2018-01-19 11:52         ` Keith Busch
2018-01-19 13:56           ` jianchao.wang
2018-01-20  2:11             ` Keith Busch
2018-01-20 14:07               ` jianchao.wang
2018-01-20 14:14                 ` jianchao.wang
