From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Date: Fri, 18 May 2018 08:20:05 +0800 From: Ming Lei To: Keith Busch Cc: Keith Busch , Jens Axboe , Laurence Oberman , Sagi Grimberg , James Smart , "linux-nvme@lists.infradead.org" , "linux-block@vger.kernel.org" , Jianchao Wang , Christoph Hellwig Subject: Re: [PATCH V6 11/11] nvme: pci: support nested EH Message-ID: <20180518001958.GA2742@ming.t460p> References: <20180516040313.13596-1-ming.lei@redhat.com> <20180516040313.13596-12-ming.lei@redhat.com> <20180516141242.GA20119@localhost.localdomain> <20180516231058.GB28727@ming.t460p> <20180517022030.GB21959@localhost.localdomain> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <20180517022030.GB21959@localhost.localdomain> List-ID: On Wed, May 16, 2018 at 08:20:31PM -0600, Keith Busch wrote: > Hi Ming, > > I'm developing the answers in code the issues you raised. It will just > take a moment to complete flushing those out. In the meantime just want > to point out why I think block/011 isn't a real test. > > On Thu, May 17, 2018 at 07:10:59AM +0800, Ming Lei wrote: > > All simulation in block/011 may happen in reality. > > If this test actually simulates reality, then the following one line > patch (plus explanation for why) would be a real "fix" as this is very > successful in passing block/011. :) > > --- > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index 1faa32cd07da..dcc5746304c4 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -2118,6 +2118,12 @@ static int nvme_pci_enable(struct nvme_dev *dev) > > if (pci_enable_device_mem(pdev)) > return result; > + /* > + * blktests block/011 disables the device without the driver knowing. > + * We'll just enable the device twice to get the enable_cnt > 1 > + * so that the test's disabling does absolutely nothing. > + */ > + pci_enable_device_mem(pdev); What I think block/011 is helpful is that it can trigger IO timeout during reset, which can be triggered in reality too. That is one big problem of NVMe driver, IMO. And this patch does fix this issue, together other timeout related races. Thanks, Ming From mboxrd@z Thu Jan 1 00:00:00 1970 From: ming.lei@redhat.com (Ming Lei) Date: Fri, 18 May 2018 08:20:05 +0800 Subject: [PATCH V6 11/11] nvme: pci: support nested EH In-Reply-To: <20180517022030.GB21959@localhost.localdomain> References: <20180516040313.13596-1-ming.lei@redhat.com> <20180516040313.13596-12-ming.lei@redhat.com> <20180516141242.GA20119@localhost.localdomain> <20180516231058.GB28727@ming.t460p> <20180517022030.GB21959@localhost.localdomain> Message-ID: <20180518001958.GA2742@ming.t460p> On Wed, May 16, 2018@08:20:31PM -0600, Keith Busch wrote: > Hi Ming, > > I'm developing the answers in code the issues you raised. It will just > take a moment to complete flushing those out. In the meantime just want > to point out why I think block/011 isn't a real test. > > On Thu, May 17, 2018@07:10:59AM +0800, Ming Lei wrote: > > All simulation in block/011 may happen in reality. > > If this test actually simulates reality, then the following one line > patch (plus explanation for why) would be a real "fix" as this is very > successful in passing block/011. :) > > --- > diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c > index 1faa32cd07da..dcc5746304c4 100644 > --- a/drivers/nvme/host/pci.c > +++ b/drivers/nvme/host/pci.c > @@ -2118,6 +2118,12 @@ static int nvme_pci_enable(struct nvme_dev *dev) > > if (pci_enable_device_mem(pdev)) > return result; > + /* > + * blktests block/011 disables the device without the driver knowing. > + * We'll just enable the device twice to get the enable_cnt > 1 > + * so that the test's disabling does absolutely nothing. > + */ > + pci_enable_device_mem(pdev); What I think block/011 is helpful is that it can trigger IO timeout during reset, which can be triggered in reality too. That is one big problem of NVMe driver, IMO. And this patch does fix this issue, together other timeout related races. Thanks, Ming