Linux-NVME Archive on lore.kernel.org
From: "Ewan D. Milne" <emilne@redhat.com>
To: linux-nvme@lists.infradead.org
Subject: Re: [PATCH] nvme-fc: fix double-free scenarios on hw queues
Date: Thu, 21 Nov 2019 14:31:40 -0500
Message-ID: <aa5ab9ccface65cba94340384f5f790b0944de2e.camel@redhat.com>
In-Reply-To: <20191121175937.19615-1-jsmart2021@gmail.com>

On Thu, 2019-11-21 at 09:59 -0800, James Smart wrote:
> If an error occurs on one of the ios used for creating an
> association, the creating routine has error paths that are
> invoked by the command failure and the error paths will free
> up the controller resources created to that point.
> 
> But the io failure was ultimately detected by an asynchronous
> completion routine, which unconditionally invokes the
> error_recovery path, which in turn calls delete_association.
> Delete association deletes all outstanding io, then tears
> down the controller resources. So the
> create_association thread can be running in parallel with
> the error_recovery thread. What was seen was the LLDD received
> a call to delete a queue, causing the LLDD to do a free of a
> resource, then the transport called the delete queue again
> causing the driver to repeat the free call. The second free
> routine corrupted the allocator. The transport shouldn't be
> making the duplicate call, and the delete queue is just one
> of the resources being freed.
> 
> The fix relies on the create_association path being completely
> serialized, issuing one command at a time. So the failed io
> completion will always be seen by the create_association path,
> and at the time of the failure there are no ios to terminate
> and no reason to manipulate queue freeze states, etc.
> The serialized condition stays true until the controller is
> transitioned to the LIVE state. Thus the fix is to change the
> error recovery path to check the controller state and only
> invoke the teardown path if not already in the CONNECTING state.
> 
> Signed-off-by: James Smart <jsmart2021@gmail.com>
> ---
>  drivers/nvme/host/fc.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
> index 679a721ae229..2acb850bf9f4 100644
> --- a/drivers/nvme/host/fc.c
> +++ b/drivers/nvme/host/fc.c
> @@ -2910,10 +2910,22 @@ nvme_fc_reconnect_or_delete(struct nvme_fc_ctrl *ctrl, int status)
>  static void
>  __nvme_fc_terminate_io(struct nvme_fc_ctrl *ctrl)
>  {
> -	nvme_stop_keep_alive(&ctrl->ctrl);
> +	/*
> +	 * if state is connecting - the error occurred as part of a
> +	 * reconnect attempt. The create_association error paths will
> +	 * clean up any outstanding io.
> +	 *
> +	 * if it's a different state - ensure all pending io is
> +	 * terminated. Given this can delay while waiting for the
> +	 * aborted io to return, we recheck adapter state below
> +	 * before changing state.
> +	 */
> +	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING) {
> +		nvme_stop_keep_alive(&ctrl->ctrl);
>  
> -	/* will block will waiting for io to terminate */
> -	nvme_fc_delete_association(ctrl);
> +		/* will block will waiting for io to terminate */
> +		nvme_fc_delete_association(ctrl);
> +	}
>  
>  	if (ctrl->ctrl.state != NVME_CTRL_CONNECTING &&
>  	    !nvme_change_ctrl_state(&ctrl->ctrl, NVME_CTRL_CONNECTING))

Reviewed-by: Ewan D. Milne <emilne@redhat.com>
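
For readers following the thread, the effect of the state check can be
modeled in userspace. This is only a minimal sketch under stated
assumptions, not kernel code: every name prefixed with model_ or MODEL_
is hypothetical, the counter stands in for the LLDD's delete-queue
callback, and the real locking, state machine, and io termination are
left out. It shows the unguarded recovery path asking the LLDD to free
the same queue twice when the create routine's error path also runs,
and the guarded path leaving the free to the create routine alone.

```c
/*
 * Userspace model of the guard added to __nvme_fc_terminate_io().
 * All names are hypothetical stand-ins, not the kernel's types.
 */
#include <stdbool.h>

enum model_state { MODEL_LIVE, MODEL_RESETTING, MODEL_CONNECTING };

struct model_ctrl {
	enum model_state state;
	bool queue_allocated;	/* stands in for one LLDD hw queue resource */
	int lldd_free_calls;	/* times the LLDD was asked to free the queue */
};

/*
 * Stand-in for the teardown: asks the LLDD to delete the queue.
 * Deliberately has no idempotence guard, mirroring the duplicate
 * delete-queue call described in the commit message.
 */
static void model_delete_association(struct model_ctrl *c)
{
	c->lldd_free_calls++;
	c->queue_allocated = false;
}

/* Error path of the create routine: frees what was created so far. */
static void model_create_association_fail(struct model_ctrl *c)
{
	model_delete_association(c);
}

/* Error recovery without the state check (behavior before the patch). */
static void model_terminate_io_unguarded(struct model_ctrl *c)
{
	model_delete_association(c);
	c->state = MODEL_CONNECTING;
}

/* Error recovery with the state check (behavior after the patch). */
static void model_terminate_io_guarded(struct model_ctrl *c)
{
	/* While CONNECTING, the create routine's error path owns cleanup. */
	if (c->state != MODEL_CONNECTING)
		model_delete_association(c);
	c->state = MODEL_CONNECTING;
}
```

Starting a model controller in MODEL_CONNECTING and running the
unguarded recovery followed by the create error path leaves
lldd_free_calls at 2; with the guarded recovery it stays at 1. From
MODEL_LIVE the guarded path still tears down once, as before.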


Thread overview: 4+ messages
2019-11-21 17:59 James Smart
2019-11-21 19:31 ` Ewan D. Milne [this message]
2019-11-22 17:01 ` Himanshu Madhani
2019-11-26 18:01 ` Keith Busch
