All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
@ 2018-10-26 22:41 Gwendal Grignou
  2018-10-26 22:58 ` Keith Busch
  0 siblings, 1 reply; 10+ messages in thread
From: Gwendal Grignou @ 2018-10-26 22:41 UTC (permalink / raw)


Given Intel 660p returns 0 when reading discarded data,
allow Write Zero.

nvme id-ns /dev/nvme0n1
...
dlfeat  : 1
...

Check Write Zeros is now enabled:
cat /sys/block/nvme0n1/queue/write_zeroes_max_bytes
2199023255040

Signed-off-by: Gwendal Grignou <gwendal at chromium.org>
---
 drivers/nvme/host/pci.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 7e09e45b0b28d..b81ad51d78b72 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -2694,6 +2694,8 @@ static const struct pci_device_id nvme_id_table[] = {
 	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
 		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
 				NVME_QUIRK_MEDIUM_PRIO_SQ },
+	{ PCI_VDEVICE(INTEL, 0xf1a8), /* Intel 660P */
+		.driver_data = NVME_QUIRK_DEALLOCATE_ZEROES, },
 	{ PCI_VDEVICE(INTEL, 0x5845),	/* Qemu emulated controller */
 		.driver_data = NVME_QUIRK_IDENTIFY_CNS, },
 	{ PCI_DEVICE(0x1bb1, 0x0100),   /* Seagate Nytro Flash Storage */
-- 
2.19.1.568.g152ad8e336-goog

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-10-26 22:41 [PATCH] nvme-pci: Add Write Zero support to Intel 660p Gwendal Grignou
@ 2018-10-26 22:58 ` Keith Busch
  2018-10-27  7:44   ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Keith Busch @ 2018-10-26 22:58 UTC (permalink / raw)


On Fri, Oct 26, 2018@03:41:46PM -0700, Gwendal Grignou wrote:
> Given Intel 660p returns 0 when reading discarded data,
> allow Write Zero.
> 
> nvme id-ns /dev/nvme0n1
> ...
> dlfeat  : 1
> ...
> 
> Check Write Zeros is now enabled:
> cat /sys/block/nvme0n1/queue/write_zeroes_max_bytes
> 2199023255040
> 
> Signed-off-by: Gwendal Grignou <gwendal at chromium.org>
> ---
>  drivers/nvme/host/pci.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 7e09e45b0b28d..b81ad51d78b72 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -2694,6 +2694,8 @@ static const struct pci_device_id nvme_id_table[] = {
>  	{ PCI_VDEVICE(INTEL, 0xf1a5),	/* Intel 600P/P3100 */
>  		.driver_data = NVME_QUIRK_NO_DEEPEST_PS |
>  				NVME_QUIRK_MEDIUM_PRIO_SQ },
> +	{ PCI_VDEVICE(INTEL, 0xf1a8), /* Intel 660P */
> +		.driver_data = NVME_QUIRK_DEALLOCATE_ZEROES, },

The quirk is for devices that pre-date the ability to discover the
capability. Now that we have a spec defined way to know a device's
deallocate behavior, and this particular device implements that method,
we should use that instead of a quirk.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-10-26 22:58 ` Keith Busch
@ 2018-10-27  7:44   ` Christoph Hellwig
  2018-10-27  9:19     ` Chaitanya Kulkarni
  2018-10-29 15:24     ` Keith Busch
  0 siblings, 2 replies; 10+ messages in thread
From: Christoph Hellwig @ 2018-10-27  7:44 UTC (permalink / raw)


On Fri, Oct 26, 2018@04:58:07PM -0600, Keith Busch wrote:
> The quirk is for devices that pre-date the ability to discover the
> capability. Now that we have a spec defined way to know a device's
> deallocate behavior, and this particular device implements that method,
> we should use that instead of a quirk.

The 'Deallocate Logical Block Features' bits 0 and 1 only tell what
is read back from deallocated logic blocks, but it does not guarantee
that all blocks a DSM deallocate is called on are actually deallocated.

That being said I really don't want to add more 'feature hacks' like this,
it is time that we go back to a proper Write Zeroes implementation.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-10-27  7:44   ` Christoph Hellwig
@ 2018-10-27  9:19     ` Chaitanya Kulkarni
  2018-10-29 15:24     ` Keith Busch
  1 sibling, 0 replies; 10+ messages in thread
From: Chaitanya Kulkarni @ 2018-10-27  9:19 UTC (permalink / raw)




From: Linux-nvme <linux-nvme-bounces@lists.infradead.org> on behalf of Christoph Hellwig <hch@lst.de>
Sent: Saturday, October 27, 2018 12:44 AM
To: Keith Busch
Cc: axboe at fb.com; Gwendal Grignou; hch at lst.de; linux-nvme at lists.infradead.org; sagi at grimberg.me
Subject: Re: [PATCH] nvme-pci: Add Write Zero support to Intel 660p
? 
 
On Fri, Oct 26, 2018@04:58:07PM -0600, Keith Busch wrote:
> The quirk is for devices that pre-date the ability to discover the
> capability. Now that we have a spec defined way to know a device's
> deallocate behavior, and this particular device implements that method,
> we should use that instead of a quirk.

The 'Deallocate Logical Block Features' bits 0 and 1 only tell what
is read back from deallocated logic blocks, but it does not guarantee
that all blocks a DSM deallocate is called on are actually deallocated.

That being said I really don't want to add more 'feature hacks' like this,
it is time that we go back to a proper Write Zeroes implementation.

[CK] I'll send out the patch shortly.

_______________________________________________
Linux-nvme mailing list
Linux-nvme at lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme
    

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-10-27  7:44   ` Christoph Hellwig
  2018-10-27  9:19     ` Chaitanya Kulkarni
@ 2018-10-29 15:24     ` Keith Busch
  2018-11-01  5:27       ` Christoph Hellwig
  1 sibling, 1 reply; 10+ messages in thread
From: Keith Busch @ 2018-10-29 15:24 UTC (permalink / raw)


On Sat, Oct 27, 2018@09:44:52AM +0200, Christoph Hellwig wrote:
> On Fri, Oct 26, 2018@04:58:07PM -0600, Keith Busch wrote:
> > The quirk is for devices that pre-date the ability to discover the
> > capability. Now that we have a spec defined way to know a device's
> > deallocate behavior, and this particular device implements that method,
> > we should use that instead of a quirk.
> 
> The 'Deallocate Logical Block Features' bits 0 and 1 only tell what
> is read back from deallocated logic blocks, but it does not guarantee
> that all blocks a DSM deallocate is called on are actually deallocated.

Sure, but whether the requested blocks are deallocated or not is
irrelevant. We're only using this feature here for its deterministic
read behavior.

> That being said I really don't want to add more 'feature hacks' like this,
> it is time that we go back to a proper Write Zeroes implementation.

Yes, I agree we ought to use nvme Write Zeroes instead of DSM if
the controller supports it, but recall we had to revert the original
implementation because it broke Linus' machine. The nvme part looked
fine to me, though, so I don't know what the problem may have been.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-10-29 15:24     ` Keith Busch
@ 2018-11-01  5:27       ` Christoph Hellwig
  2018-11-01 14:38         ` Keith Busch
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2018-11-01  5:27 UTC (permalink / raw)


On Mon, Oct 29, 2018@09:24:01AM -0600, Keith Busch wrote:
> > The 'Deallocate Logical Block Features' bits 0 and 1 only tell what
> > is read back from deallocated logic blocks, but it does not guarantee
> > that all blocks a DSM deallocate is called on are actually deallocated.
> 
> Sure, but whether the requested blocks are deallocated or not is
> irrelevant. We're only using this feature here for its deterministic
> read behavior.

I can't parse this.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-11-01  5:27       ` Christoph Hellwig
@ 2018-11-01 14:38         ` Keith Busch
  2018-11-02  6:08           ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Keith Busch @ 2018-11-01 14:38 UTC (permalink / raw)


On Thu, Nov 01, 2018@06:27:49AM +0100, Christoph Hellwig wrote:
> On Mon, Oct 29, 2018@09:24:01AM -0600, Keith Busch wrote:
> > > The 'Deallocate Logical Block Features' bits 0 and 1 only tell what
> > > is read back from deallocated logic blocks, but it does not guarantee
> > > that all blocks a DSM deallocate is called on are actually deallocated.
> > 
> > Sure, but whether the requested blocks are deallocated or not is
> > irrelevant. We're only using this feature here for its deterministic
> > read behavior.
> 
> I can't parse this.

We currently use DSM Deallocate for REQ_OP_WRITE_ZEROES. We don't care
if the blocks are actually deallocated. We only care if reading them
back determinisitically returns 0's, and DLFEAT tells us if we can rely
on that behavior. I'm just suggesting to use that instead of using
the NVME_QUIRK_DEALLOCATE_ZEROES quirk.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-11-01 14:38         ` Keith Busch
@ 2018-11-02  6:08           ` Christoph Hellwig
  2018-11-02 14:21             ` Keith Busch
  0 siblings, 1 reply; 10+ messages in thread
From: Christoph Hellwig @ 2018-11-02  6:08 UTC (permalink / raw)


On Thu, Nov 01, 2018@08:38:08AM -0600, Keith Busch wrote:
> We currently use DSM Deallocate for REQ_OP_WRITE_ZEROES. We don't care
> if the blocks are actually deallocated. We only care if reading them
> back determinisitically returns 0's, and DLFEAT tells us if we can rely
> on that behavior.

No, DLFEAT _only_ tells you the value you read from actually deallocated
blocks.  From the DLFEAT description:

"Bits 2:0 indicate the values read from a deallocated logical block and its
metadata (excluding protection information). The values for this field have
the following meanings:"

And from Section 6.7:

"This command is advisory; a compliant controller may choose to take no
action based on information provided."

"Attribute ? Deallocate (AD): If set to ?1?, then the NVM subsystem may
deallocate all provided ranges. The data returned for a deallocated range
is specified in section 6.7.1.1."

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-11-02  6:08           ` Christoph Hellwig
@ 2018-11-02 14:21             ` Keith Busch
  2018-11-03  8:21               ` Christoph Hellwig
  0 siblings, 1 reply; 10+ messages in thread
From: Keith Busch @ 2018-11-02 14:21 UTC (permalink / raw)


On Fri, Nov 02, 2018@07:08:15AM +0100, Christoph Hellwig wrote:
> On Thu, Nov 01, 2018@08:38:08AM -0600, Keith Busch wrote:
> > We currently use DSM Deallocate for REQ_OP_WRITE_ZEROES. We don't care
> > if the blocks are actually deallocated. We only care if reading them
> > back determinisitically returns 0's, and DLFEAT tells us if we can rely
> > on that behavior.
> 
> No, DLFEAT _only_ tells you the value you read from actually deallocated
> blocks.  From the DLFEAT description:
> 
> "Bits 2:0 indicate the values read from a deallocated logical block and its
> metadata (excluding protection information). The values for this field have
> the following meanings:"
> 
> And from Section 6.7:
> 
> "This command is advisory; a compliant controller may choose to take no
> action based on information provided."
> 
> "Attribute ? Deallocate (AD): If set to ?1?, then the NVM subsystem may
> deallocate all provided ranges. The data returned for a deallocated range
> is specified in section 6.7.1.1."

Okay, that wording is unfortunate and the feature is useless if that's
how it can be interpreted.

The "advisory" parts were originally directed toward the Context
Attributes referring to access type and frequency.

The intended behavior for Deallocate was a controller picks exactly
one behavior: return 0's, 1's, or the previous written data. It was
never intended, as far as I know, for a controller to return read
data differently for any given range after returning success to a DSM
Deallocate command.

That must be at least part of the motivation for Write Zeroes adding
deallocate support.

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH] nvme-pci: Add Write Zero support to Intel 660p
  2018-11-02 14:21             ` Keith Busch
@ 2018-11-03  8:21               ` Christoph Hellwig
  0 siblings, 0 replies; 10+ messages in thread
From: Christoph Hellwig @ 2018-11-03  8:21 UTC (permalink / raw)


On Fri, Nov 02, 2018@08:21:15AM -0600, Keith Busch wrote:
> Okay, that wording is unfortunate and the feature is useless if that's
> how it can be interpreted.

While I agree that it is unfortunate it matches how ATA TRIM and
SCSI UNMAP work.  And that is how it happened - I vague remember
discussions in the working group.

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2018-11-03  8:21 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2018-10-26 22:41 [PATCH] nvme-pci: Add Write Zero support to Intel 660p Gwendal Grignou
2018-10-26 22:58 ` Keith Busch
2018-10-27  7:44   ` Christoph Hellwig
2018-10-27  9:19     ` Chaitanya Kulkarni
2018-10-29 15:24     ` Keith Busch
2018-11-01  5:27       ` Christoph Hellwig
2018-11-01 14:38         ` Keith Busch
2018-11-02  6:08           ` Christoph Hellwig
2018-11-02 14:21             ` Keith Busch
2018-11-03  8:21               ` Christoph Hellwig

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.