From: Keith Busch <>
To: Christoph Hellwig <>
Subject: Re: [PATCHv2 2/2] nvme/pci: Mask device interrupts for threaded handlers
Date: Tue, 3 Dec 2019 05:07:05 -0700	[thread overview]
Message-ID: <> (raw)
In-Reply-To: <>

On Tue, Dec 03, 2019 at 08:47:23AM +0100, Christoph Hellwig wrote:
> On Tue, Dec 03, 2019 at 07:20:58AM +0900, Keith Busch wrote:
> > +static irqreturn_t nvme_irq_thread(int irq, void *data)
> > +{
> > +	struct nvme_queue *nvmeq = data;
> > +
> > +	nvme_irq(irq, data);
> > +	if (to_pci_dev(nvmeq->dev->dev)->msix_enabled)
> > +		__pci_msix_desc_mask_irq(irq_get_msi_desc(irq), 0);
> > +	else
> > +		writel(1 << nvmeq->cq_vector, nvmeq->dev->bar + NVME_REG_INTMC);
> So independent of the indirection issue can we have a theory of operation
> on why not using a read to flush the posted writes to disable/enable the
> interrupt (either variant) here is fine?  Let's assume we have a worse
> case implementation where no write ever gets delivered to the device
> until we do a read neither of them will ever hit the device as we don't
> really do MMIO reads in nvme during normal operation.


First, if the masking MMIO write is stalled, the read back doesn't
make it complete any faster. It just means the flush takes that
much longer.

Also note that the write can't be reordered with other writes to
that device. You'd have to use writel_relaxed() for that.

We notify the nvme controller of new commands using doorbell writes.
If all writes are stalled, then the device doesn't have new commands
to complete, so there won't be any interrupts for the driver to
handle in the first place.

If we want to craft a worst case scenario where the device was aware
of the maximum possible commands on a queue (1023 for this driver),
we could theoretically observe that many MSIs each returning exactly
one IRQ_HANDLED. This is below the 0.01% threshold for "nobody cared".
Subsequent new commands would be stuck behind the MSI mask write,
so there can't possibly be new IO while that mask write is still
pending.

So, I think if the device never sees the MSI masking while the
thread is reaping the completion queue, we'll just spend CPU time
in an unnecessary irq handler that would have otherwise been consumed
on a flushing readl().


Thread overview: 10+ messages
2019-12-02 22:20 [PATCHv2 0/2] Keith Busch
2019-12-02 22:20 ` [PATCHv2 1/2] PCI/MSI: Export __pci_msix_desc_mask_irq Keith Busch
2019-12-02 22:46   ` Christoph Hellwig
2019-12-03  9:04     ` Sebastian Andrzej Siewior
2019-12-06 21:18       ` Keith Busch
2019-12-02 22:20 ` [PATCHv2 2/2] nvme/pci: Mask device interrupts for threaded handlers Keith Busch
2019-12-03  7:47   ` Christoph Hellwig
2019-12-03 12:07     ` Keith Busch [this message]
2019-12-04 10:10   ` Sironi, Filippo
2019-12-04 13:58     ` hch
