From: "Verma, Vishal L" <vishal.l.verma@intel.com>
To: "kbusch@kernel.org" <kbusch@kernel.org>
Cc: "hch@lst.de" <hch@lst.de>,
	"Williams, Dan J" <dan.j.williams@intel.com>,
	"Schofield, Alison" <alison.schofield@intel.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"Weiny, Ira" <ira.weiny@intel.com>,
	"Widawsky, Ben" <ben.widawsky@intel.com>
Subject: Re: [BISECTED] nvme probe failure with v5.13-rc1
Date: Fri, 21 May 2021 16:50:15 +0000
Message-ID: <e8ad9de961f3bfcb748733b59c189aa577ffa1fd.camel@intel.com>
In-Reply-To: <20210521145705.GA29013@redsun51.ssa.fujisawa.hgst.com>

On Fri, 2021-05-21 at 23:57 +0900, Keith Busch wrote:
> On Fri, May 21, 2021 at 05:00:29AM +0000, Verma, Vishal L wrote:
> > Hi,
> > 
> > I ran into this failure to probe an nvme device in an emulator
> > (simics). It looks like there is a ~60 second wait followed by a
> > timeout and a failure to boot (the root device is an nvme disk) with
> > these messages in the log:
> > 
> >    [   67.174010] nvme nvme0: I/O 5 QID 0 timeout, disable controller
> >    [   67.175793] nvme nvme0: Removing after probe failure status: -4
> > 
> > I bisected this to:
> >    5befc7c26e5a ("nvme: implement non-mdts command limits") 
> > 
> > It's not immediately obvious to me what's causing the problem.
> > Reverting the above commit fixes it. It is easily reproducible - I'd be
> > happy to provide more info about the emulated device or test out
> > patches or theories.
> > 
> > It is of course possible that the emulated device is behaving in some
> > non spec-compliant way, in which case I'd appreciate any help figuring
> > out what that is.
> 
> Hi Vishal,
> 
> The patch you bisected to sends only a single Identify command, so it
> sounds like that must be the command that times out. The controller is
> not required to support this specific identify (CNS 0x6), but it is
> required to produce a response. If the identify is unsupported, the
> controller should respond with an appropriate error (Invalid Field In
> Command), but it looks like the controller didn't respond at all.
> 
> So based on your observation, it sounds like the simics implementation
> has an identify bug. The spec doesn't provide a way for the driver to
> know ahead of time whether or not this identification is supported, so
> the driver just has to try it and react to the status code. If the
> implementation can't be fixed, then we'll need to quirk your device.
> 
> If you want to confirm for certain that the new identify is the source
> of your timeout, you could try the following patch and the timeout
> should go away:
> 
> ---
> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 1a73eed61eee..b16d31d82606 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -2711,7 +2711,7 @@ static int nvme_init_non_mdts_limits(struct nvme_ctrl *ctrl)
>  	else
>  		ctrl->max_zeroes_sectors = 0;
> 
> -	if (nvme_ctrl_limited_cns(ctrl))
> +	if (true || nvme_ctrl_limited_cns(ctrl))
>  		return 0;
> 
>  	id = kzalloc(sizeof(*id), GFP_KERNEL);
> --
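
For reference, the driver-side behaviour Keith describes (send the
optional Identify, then react to the result) can be sketched roughly as
below. This is a standalone illustration, not the code in
drivers/nvme/host/core.c: the function, enum and quirk names are made up
for the example, and it only assumes the usual convention that a negative
return is a transport-level error such as a timeout, zero is success, and
a positive value is an NVMe status code.

```c
#include <assert.h>
#include <stddef.h>

/* Generic command status "Invalid Field in Command" (NVMe base spec). */
#define SKETCH_SC_INVALID_FIELD        0x02
/* Negative errno-style value; the log above reports -4 after the timeout. */
#define SKETCH_ERR_TIMEOUT             (-4)
/* Hypothetical quirk bit: never send the optional Identify to this device. */
#define SKETCH_QUIRK_SKIP_CS_IDENTIFY  (1u << 0)

enum sketch_result {
	SKETCH_LIMITS_FROM_CTRL,  /* Identify succeeded, use reported limits */
	SKETCH_LIMITS_DEFAULT,    /* Identify skipped or rejected, keep defaults */
	SKETCH_PROBE_FAILED       /* no completion at all, probe cannot continue */
};

/* Sends the Identify; <0 transport error, 0 success, >0 NVMe status. */
typedef int (*sketch_identify_fn)(void *ctx);

static enum sketch_result
sketch_init_optional_limits(unsigned int quirks, sketch_identify_fn identify,
			    void *ctx)
{
	int ret;

	assert(identify != NULL);

	/* A quirked device is never asked, which is what the test patch forces. */
	if (quirks & SKETCH_QUIRK_SKIP_CS_IDENTIFY)
		return SKETCH_LIMITS_DEFAULT;

	ret = identify(ctx);
	if (ret < 0)
		return SKETCH_PROBE_FAILED;     /* e.g. the command timed out */
	if (ret > 0)
		return SKETCH_LIMITS_DEFAULT;   /* e.g. Invalid Field in Command */
	return SKETCH_LIMITS_FROM_CTRL;
}

/* Three fake controllers for illustration only. */
static int sketch_ctrl_silent(void *ctx)   { (void)ctx; return SKETCH_ERR_TIMEOUT; }
static int sketch_ctrl_rejects(void *ctx)  { (void)ctx; return SKETCH_SC_INVALID_FIELD; }
static int sketch_ctrl_supports(void *ctx) { (void)ctx; return 0; }
```

In this model the silent controller is the only one that fails the
probe; a controller that rejects the command, or one that is quirked,
simply keeps the default limits.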

Hi Keith,

Thanks for looking into it - yes, with that patch applied the problem goes away.
Let me chat with the simics folks and see if I can get them to fix it.
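
A fix on the device-model side would presumably amount to always posting
a completion for Identify, with Invalid Field in Command for any CNS
value the model does not implement. A minimal sketch, assuming a
hypothetical model with simplified command and completion structures
(none of these names come from simics or from the kernel; only the
opcode, CNS and status values are taken from the NVMe specification):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_ADMIN_IDENTIFY    0x06  /* admin opcode: Identify */
#define MODEL_SC_SUCCESS        0x00
#define MODEL_SC_INVALID_FIELD  0x02  /* Invalid Field in Command */
#define MODEL_CNS_CS_CTRL       0x06  /* the CNS value discussed above */

/* Hypothetical, heavily simplified submission entry. */
struct model_cmd {
	uint8_t  opcode;
	uint16_t command_id;
	uint32_t cdw10;       /* for Identify, CNS is in bits 7:0 */
};

/* Hypothetical, heavily simplified completion entry. */
struct model_cqe {
	uint16_t command_id;
	uint16_t status;
	bool     posted;      /* true once a completion has been queued */
};

/* Assumption: this model only implements CNS 0x00, 0x01 and 0x02. */
static bool model_supports_cns(uint8_t cns)
{
	return cns == 0x00 || cns == 0x01 || cns == 0x02;
}

static struct model_cqe model_handle_identify(const struct model_cmd *cmd)
{
	struct model_cqe cqe = { 0 };
	uint8_t cns;

	assert(cmd != NULL);
	assert(cmd->opcode == MODEL_ADMIN_IDENTIFY);

	cns = (uint8_t)(cmd->cdw10 & 0xff);

	/* Every path posts a completion; unsupported CNS gets an error status. */
	cqe.command_id = cmd->command_id;
	cqe.status = model_supports_cns(cns) ? MODEL_SC_SUCCESS
					     : MODEL_SC_INVALID_FIELD;
	cqe.posted = true;
	return cqe;
}
```

The point of the sketch is only that there is no code path on which the
command is dropped without a completion, which is what leaves the host
waiting until the admin timeout fires.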



