linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/2] nvme: Add kernel API for admin command
@ 2019-09-13 11:16 Robert Baldyga
  2019-09-13 11:16 ` [PATCH 1/2] nvme: add API for sending admin commands by bdev Robert Baldyga
                   ` (2 more replies)
  0 siblings, 3 replies; 14+ messages in thread
From: Robert Baldyga @ 2019-09-13 11:16 UTC (permalink / raw)
  To: kbusch, axboe, hch, sagi, linux-nvme, linux-kernel
  Cc: michal.rakowski, Robert Baldyga

Hello,

This patchset adds two functions providing kernel to kernel API
for submiting NVMe admin commands. This is for use of NVMe-aware
block device drivers stacking on top of NVMe drives. An example of
such driver is Open CAS Linux [1] which uses NVMe extended LBA
formats and thus needs to issue commands like nvme_admin_identify.

[1] https://github.com/Open-CAS/open-cas-linux

Best regards,
Robert Baldyga

Michal Rakowski (1):
  nvme: add API for sending admin commands by bdev

Robert Baldyga (1):
  nvme: add API for getting nsid by bdev

 drivers/nvme/host/core.c | 37 +++++++++++++++++++++++++++++++++++++
 include/linux/nvme.h     |  5 +++++
 2 files changed, 42 insertions(+)

-- 
2.17.1

--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH 1/2] nvme: add API for sending admin commands by bdev
  2019-09-13 11:16 [PATCH 0/2] nvme: Add kernel API for admin command Robert Baldyga
@ 2019-09-13 11:16 ` Robert Baldyga
  2019-09-16  6:36   ` kbuild test robot
  2019-09-13 11:16 ` [PATCH 2/2] nvme: add API for getting nsid " Robert Baldyga
  2019-09-13 14:37 ` [PATCH 0/2] nvme: Add kernel API for admin command Christoph Hellwig
  2 siblings, 1 reply; 14+ messages in thread
From: Robert Baldyga @ 2019-09-13 11:16 UTC (permalink / raw)
  To: kbusch, axboe, hch, sagi, linux-nvme, linux-kernel
  Cc: michal.rakowski, Robert Baldyga

From: Michal Rakowski <michal.rakowski@intel.com>

Add kernel API function for sending nvme admin commands.

Signed-off-by: Michal Rakowski <michal.rakowski@intel.com>
Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
---
 drivers/nvme/host/core.c | 23 +++++++++++++++++++++++
 include/linux/nvme.h     |  3 +++
 2 files changed, 26 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index d3d6b7bd6903..06f917f391c4 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -812,6 +812,29 @@ int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 }
 EXPORT_SYMBOL_GPL(nvme_submit_sync_cmd);
 
+int nvme_submit_admin_cmd_by_bdev(struct block_device *bdev,
+		struct nvme_command *c, void *buffer, unsigned int bufflen)
+{
+	struct nvme_ns *ns;
+	struct nvme_ctrl *ctrl;
+	int error;
+
+	if (!bdev && !c)
+		return -EINVAL;
+
+	ns = bdev->bd_disk->private_data;
+	ctrl = ns->ctrl;
+
+	error = nvme_submit_sync_cmd(ctrl->admin_q, c, buffer, bufflen);
+	if (error) {
+		dev_warn(ctrl->device, "Admin command failed (%d)\n", error);
+		return error;
+	}
+
+	return error;
+}
+EXPORT_SYMBOL_GPL(nvme_submit_admin_cmd_by_bdev);
+
 static void *nvme_add_user_metadata(struct bio *bio, void __user *ubuf,
 		unsigned len, u32 seed, bool write)
 {
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 01aa6a6c241d..6f26c5654514 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -1392,4 +1392,7 @@ struct nvme_completion {
 #define NVME_MINOR(ver)		(((ver) >> 8) & 0xff)
 #define NVME_TERTIARY(ver)	((ver) & 0xff)
 
+int nvme_submit_admin_cmd_by_bdev(struct block_device *bdev,
+		struct nvme_command *c, void *buffer, unsigned int bufflen);
+
 #endif /* _LINUX_NVME_H */
-- 
2.17.1

--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* [PATCH 2/2] nvme: add API for getting nsid by bdev
  2019-09-13 11:16 [PATCH 0/2] nvme: Add kernel API for admin command Robert Baldyga
  2019-09-13 11:16 ` [PATCH 1/2] nvme: add API for sending admin commands by bdev Robert Baldyga
@ 2019-09-13 11:16 ` Robert Baldyga
  2019-09-16  6:57   ` kbuild test robot
  2019-09-13 14:37 ` [PATCH 0/2] nvme: Add kernel API for admin command Christoph Hellwig
  2 siblings, 1 reply; 14+ messages in thread
From: Robert Baldyga @ 2019-09-13 11:16 UTC (permalink / raw)
  To: kbusch, axboe, hch, sagi, linux-nvme, linux-kernel
  Cc: michal.rakowski, Robert Baldyga

Add kernel API function for getting nvme namespace id.

Signed-off-by: Robert Baldyga <robert.baldyga@intel.com>
---
 drivers/nvme/host/core.c | 14 ++++++++++++++
 include/linux/nvme.h     |  2 ++
 2 files changed, 16 insertions(+)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 06f917f391c4..35d121cd2378 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -812,6 +812,20 @@ int nvme_submit_sync_cmd(struct request_queue *q, struct nvme_command *cmd,
 }
 EXPORT_SYMBOL_GPL(nvme_submit_sync_cmd);
 
+int nvme_get_nsid_by_bdev(struct block_device *bdev, unsigned int *nsid)
+{
+	struct nvme_ns *ns;
+
+	if (!bdev && !nsid)
+		return -EINVAL;
+
+	ns = bdev->bd_disk->private_data;
+	*nsid = ns->head->ns_id;
+
+	return 0;
+}
+EXPORT_SYMBOL_GPL(nvme_nsid_by_bdev);
+
 int nvme_submit_admin_cmd_by_bdev(struct block_device *bdev,
 		struct nvme_command *c, void *buffer, unsigned int bufflen)
 {
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 6f26c5654514..c54e210ad271 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -1392,6 +1392,8 @@ struct nvme_completion {
 #define NVME_MINOR(ver)		(((ver) >> 8) & 0xff)
 #define NVME_TERTIARY(ver)	((ver) & 0xff)
 
+int nvme_get_nsid_by_bdev(struct block_device *bdev, unsigned int *nsid);
+
 int nvme_submit_admin_cmd_by_bdev(struct block_device *bdev,
 		struct nvme_command *c, void *buffer, unsigned int bufflen);
 
-- 
2.17.1

--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


^ permalink raw reply related	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-13 11:16 [PATCH 0/2] nvme: Add kernel API for admin command Robert Baldyga
  2019-09-13 11:16 ` [PATCH 1/2] nvme: add API for sending admin commands by bdev Robert Baldyga
  2019-09-13 11:16 ` [PATCH 2/2] nvme: add API for getting nsid " Robert Baldyga
@ 2019-09-13 14:37 ` Christoph Hellwig
  2019-09-16  7:16   ` Baldyga, Robert
  2 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-13 14:37 UTC (permalink / raw)
  To: Robert Baldyga
  Cc: kbusch, axboe, hch, sagi, linux-nvme, linux-kernel, michal.rakowski

On Fri, Sep 13, 2019 at 01:16:08PM +0200, Robert Baldyga wrote:
> Hello,
> 
> This patchset adds two functions providing kernel to kernel API
> for submiting NVMe admin commands. This is for use of NVMe-aware
> block device drivers stacking on top of NVMe drives. An example of
> such driver is Open CAS Linux [1] which uses NVMe extended LBA
> formats and thus needs to issue commands like nvme_admin_identify.

We never add functionality for out of tree crap.  And this shit really
is a bunch of crap, so it is unlikely to ever be merged. 

Why can't intel sometimes actually do something useful for a change
instead of piling junk over junk?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 1/2] nvme: add API for sending admin commands by bdev
  2019-09-13 11:16 ` [PATCH 1/2] nvme: add API for sending admin commands by bdev Robert Baldyga
@ 2019-09-16  6:36   ` kbuild test robot
  0 siblings, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2019-09-16  6:36 UTC (permalink / raw)
  To: Robert Baldyga
  Cc: kbuild-all, kbusch, axboe, hch, sagi, linux-nvme, linux-kernel,
	michal.rakowski, Robert Baldyga

[-- Attachment #1: Type: text/plain, Size: 1595 bytes --]

Hi Robert,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[cannot apply to v5.3 next-20190915]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Robert-Baldyga/nvme-Add-kernel-API-for-admin-command/20190916-134358
config: sparc64-allmodconfig (attached as .config)
compiler: sparc64-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=sparc64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from drivers/nvme/host/nvme.h:9:0,
                    from drivers/nvme/host/fault_inject.c:9:
>> include/linux/nvme.h:1395:42: warning: 'struct block_device' declared inside parameter list will not be visible outside of this definition or declaration
    int nvme_submit_admin_cmd_by_bdev(struct block_device *bdev,
                                             ^~~~~~~~~~~~

vim +1395 include/linux/nvme.h

  1394	
> 1395	int nvme_submit_admin_cmd_by_bdev(struct block_device *bdev,
  1396			struct nvme_command *c, void *buffer, unsigned int bufflen);
  1397	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 58661 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 2/2] nvme: add API for getting nsid by bdev
  2019-09-13 11:16 ` [PATCH 2/2] nvme: add API for getting nsid " Robert Baldyga
@ 2019-09-16  6:57   ` kbuild test robot
  0 siblings, 0 replies; 14+ messages in thread
From: kbuild test robot @ 2019-09-16  6:57 UTC (permalink / raw)
  To: Robert Baldyga
  Cc: kbuild-all, kbusch, axboe, hch, sagi, linux-nvme, linux-kernel,
	michal.rakowski, Robert Baldyga

[-- Attachment #1: Type: text/plain, Size: 2430 bytes --]

Hi Robert,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[cannot apply to v5.3 next-20190915]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Robert-Baldyga/nvme-Add-kernel-API-for-admin-command/20190916-134358
config: sparc64-allmodconfig (attached as .config)
compiler: sparc64-linux-gcc (GCC) 7.4.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=7.4.0 make.cross ARCH=sparc64 

If you fix the issue, kindly add following tag
Reported-by: kbuild test robot <lkp@intel.com>

All error/warnings (new ones prefixed by >>):

   In file included from include/linux/linkage.h:7:0,
                    from include/linux/kernel.h:8,
                    from include/asm-generic/bug.h:18,
                    from arch/sparc/include/asm/bug.h:25,
                    from include/linux/bug.h:5,
                    from include/linux/thread_info.h:12,
                    from arch/sparc/include/asm/current.h:15,
                    from include/linux/sched.h:12,
                    from include/linux/blkdev.h:5,
                    from drivers/nvme/host/core.c:7:
>> drivers/nvme/host/core.c:827:19: error: 'nvme_nsid_by_bdev' undeclared here (not in a function); did you mean 'nvme_get_nsid_by_bdev'?
    EXPORT_SYMBOL_GPL(nvme_nsid_by_bdev);
                      ^
   include/linux/export.h:79:16: note: in definition of macro '___EXPORT_SYMBOL'
     extern typeof(sym) sym;      \
                   ^~~
>> drivers/nvme/host/core.c:827:1: note: in expansion of macro 'EXPORT_SYMBOL_GPL'
    EXPORT_SYMBOL_GPL(nvme_nsid_by_bdev);
    ^~~~~~~~~~~~~~~~~

vim +827 drivers/nvme/host/core.c

   814	
   815	int nvme_get_nsid_by_bdev(struct block_device *bdev, unsigned int *nsid)
   816	{
   817		struct nvme_ns *ns;
   818	
   819		if (!bdev && !nsid)
   820			return -EINVAL;
   821	
   822		ns = bdev->bd_disk->private_data;
   823		*nsid = ns->head->ns_id;
   824	
   825		return 0;
   826	}
 > 827	EXPORT_SYMBOL_GPL(nvme_nsid_by_bdev);
   828	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 58661 bytes --]

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-13 14:37 ` [PATCH 0/2] nvme: Add kernel API for admin command Christoph Hellwig
@ 2019-09-16  7:16   ` Baldyga, Robert
  2019-09-16  7:34     ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Baldyga, Robert @ 2019-09-16  7:16 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: kbusch, axboe, sagi, linux-nvme, linux-kernel, Rakowski, Michal

On Fri, Sep 13, 2019 at 04:37:09PM +0200, Christoph Hellwig wrote:
> On Fri, Sep 13, 2019 at 01:16:08PM +0200, Robert Baldyga wrote:
> > Hello,
> > 
> > This patchset adds two functions providing kernel to kernel API 
> > for submiting NVMe admin commands. This is for use of NVMe-aware
> > block device drivers stacking on top of NVMe drives. An example of
> > such driver is Open CAS Linux [1] which uses NVMe extended LBA 
> > formats and thus needs to issue commands like nvme_admin_identify.
> 
> We never add functionality for out of tree crap.  And this shit really
> is a bunch of crap, so it is unlikely to ever be merged. 

So that modules which are by design out of tree have to hack around
lack of API allowing to use functionality implemented by driver.
Don't you think that this is what actually produces crap?

> Why can't intel sometimes actually do something useful for a change
> instead of piling junk over junk?

Proposed API is equally useful for both in tree and out of tree modules,
so I find your comment unrelated.

If you don't like the way it's done, we can look for alternatives.
The point is to allow other drivers use NVMe admin commands, which is
currently not possible as neither the block layer nor the nvme driver
provides sufficient API.

Best regards,
Robert Baldyga
--------------------------------------------------------------------

Intel Technology Poland sp. z o.o.
ul. Slowackiego 173 | 80-298 Gdansk | Sad Rejonowy Gdansk Polnoc | VII Wydzial Gospodarczy Krajowego Rejestru Sadowego - KRS 101882 | NIP 957-07-52-316 | Kapital zakladowy 200.000 PLN.

Ta wiadomosc wraz z zalacznikami jest przeznaczona dla okreslonego adresata i moze zawierac informacje poufne. W razie przypadkowego otrzymania tej wiadomosci, prosimy o powiadomienie nadawcy oraz trwale jej usuniecie; jakiekolwiek
przegladanie lub rozpowszechnianie jest zabronione.
This e-mail and any attachments may contain confidential material for the sole use of the intended recipient(s). If you are not the intended recipient, please contact the sender and delete all copies; any review or distribution by
others is strictly prohibited.


^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-16  7:16   ` Baldyga, Robert
@ 2019-09-16  7:34     ` Christoph Hellwig
  2019-09-16 12:13       ` Baldyga, Robert
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-16  7:34 UTC (permalink / raw)
  To: Baldyga, Robert
  Cc: Christoph Hellwig, kbusch, axboe, sagi, linux-nvme, linux-kernel,
	Rakowski, Michal

On Mon, Sep 16, 2019 at 07:16:52AM +0000, Baldyga, Robert wrote:
> On Fri, Sep 13, 2019 at 04:37:09PM +0200, Christoph Hellwig wrote:
> > On Fri, Sep 13, 2019 at 01:16:08PM +0200, Robert Baldyga wrote:
> > > Hello,
> > > 
> > > This patchset adds two functions providing kernel to kernel API 
> > > for submiting NVMe admin commands. This is for use of NVMe-aware
> > > block device drivers stacking on top of NVMe drives. An example of
> > > such driver is Open CAS Linux [1] which uses NVMe extended LBA 
> > > formats and thus needs to issue commands like nvme_admin_identify.
> > 
> > We never add functionality for out of tree crap.  And this shit really
> > is a bunch of crap, so it is unlikely to ever be merged. 
> 
> So that modules which are by design out of tree have to hack around
> lack of API allowing to use functionality implemented by driver.
> Don't you think that this is what actually produces crap?

Because you added another badly designed caching layer instead of fixing
up one of the 2 to 3 (depending on how you count) in the kernel tree.
And yours is amazingly bad even compared to the not very nice one in the
tree.  It basically spends more efforts on abstracting away abstraction
that you wouldn't need without another layer of abstractions.  And it
does all that pointlessly tied to NVMe through a bunch of layering
violations.

> 
> > Why can't intel sometimes actually do something useful for a change
> > instead of piling junk over junk?
> 
> Proposed API is equally useful for both in tree and out of tree modules,
> so I find your comment unrelated.

It is not.  We will not let random kernel modules directly issue nvme
admin command as there is no reason for it.  And even if for some weird
reason we did we'd certainly not do it for out of tree modules.

> If you don't like the way it's done, we can look for alternatives.
> The point is to allow other drivers use NVMe admin commands, which is
> currently not possible as neither the block layer nor the nvme driver
> provides sufficient API.

And the answer is that we are not going to allow that.  And we'd not
allow other things even if they were not a bad design for out of tree
modules.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* RE: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-16  7:34     ` Christoph Hellwig
@ 2019-09-16 12:13       ` Baldyga, Robert
  2019-09-17 16:39         ` Keith Busch
  0 siblings, 1 reply; 14+ messages in thread
From: Baldyga, Robert @ 2019-09-16 12:13 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: kbusch, axboe, sagi, linux-nvme, linux-kernel, Rakowski, Michal

On Mon, Sep 16, 2019 at 09:34:55AM +0200, Christoph Hellwig wrote:
> On Mon, Sep 16, 2019 at 07:16:52AM +0000, Baldyga, Robert wrote:
> > On Fri, Sep 13, 2019 at 04:37:09PM +0200, Christoph Hellwig wrote:
> > > On Fri, Sep 13, 2019 at 01:16:08PM +0200, Robert Baldyga wrote:
> > > > Hello,
> > > > 
> > > > This patchset adds two functions providing kernel to kernel API 
> > > > for submiting NVMe admin commands. This is for use of NVMe-aware
> > > > block device drivers stacking on top of NVMe drives. An example of
> > > > such driver is Open CAS Linux [1] which uses NVMe extended LBA 
> > > > formats and thus needs to issue commands like nvme_admin_identify.
> > > 
> > > We never add functionality for out of tree crap.  And this shit really
> > > is a bunch of crap, so it is unlikely to ever be merged. 
> > 
> > So that modules which are by design out of tree have to hack around
> > lack of API allowing to use functionality implemented by driver.
> > Don't you think that this is what actually produces crap?
> 
> Because you added another badly designed caching layer instead of fixing
> up one of the 2 to 3 (depending on how you count) in the kernel tree.
> And yours is amazingly bad even compared to the not very nice one in the 
> tree.  It basically spends more efforts on abstracting away abstraction
> that you wouldn't need without another layer of abstractions.  And it
> does all that pointlessly tied to NVMe through a bunch of layering
> violations.
> 
> > 
> > > Why can't intel sometimes actually do something useful for a change
> > > instead of piling junk over junk?
> > 
> > Proposed API is equally useful for both in tree and out of tree modules,
> > so I find your comment unrelated.
> 
> It is not.  We will not let random kernel modules directly issue nvme
> admin command as there is no reason for it.  And even if for some weird
> reason we did we'd certainly not do it for out of tree modules.

Ok, fair enough. We want to keep things hidden behind certain layers,
and that's definitely a good thing. But there is a problem with these
layers - they do not expose all the features. For example AFAIK there
is no clear way to use 512+8 format via block layer (nor even a way 
to get know that bdev is formatted to particular lbaf). It's impossible
to use it without layering violations, which nobody sees as a perfect
solution, but it is the only one that works.

> > If you don't like the way it's done, we can look for alternatives.
> > The point is to allow other drivers use NVMe admin commands, which is
> > currently not possible as neither the block layer nor the nvme driver
> > provides sufficient API.
> 
> And the answer is that we are not going to allow that.  And we'd not 
> allow other things even if they were not a bad design for out of tree
> modules.

I'm not going to insist on merging changes dedicated for out of tree
modules. I'd be happy to get involved in improving kernel support for 
NVMe so that there would be no need for that changes. What I proposed
is a quick fix for the problem that, for sure, can be solved another,
better way.

And this is not about being in tree or out of tree - a lot of out of
tree modules are doing well without any special care from the upstream.
This is about lack of support for certain features. If you have any
proposal how we should solve this problem in a better way than I
proposed in my humble patchset, I will gladly submit new patches.

Best regards,
Robert Baldyga

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-16 12:13       ` Baldyga, Robert
@ 2019-09-17 16:39         ` Keith Busch
  2019-09-18 13:26           ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Keith Busch @ 2019-09-17 16:39 UTC (permalink / raw)
  To: Baldyga, Robert
  Cc: Christoph Hellwig, axboe, sagi, linux-nvme, linux-kernel,
	Rakowski, Michal

On Mon, Sep 16, 2019 at 12:13:24PM +0000, Baldyga, Robert wrote:
> Ok, fair enough. We want to keep things hidden behind certain layers,
> and that's definitely a good thing. But there is a problem with these
> layers - they do not expose all the features. For example AFAIK there
> is no clear way to use 512+8 format via block layer (nor even a way 
> to get know that bdev is formatted to particular lbaf). It's impossible
> to use it without layering violations, which nobody sees as a perfect
> solution, but it is the only one that works.

I think your quickest path to supporting such a format is to consider
each part separately, then bounce and interleave/unmix the data +
metadata at another layer that understands how the data needs to be laid
out in host memory. Something like this RFC here:

  http://lists.infradead.org/pipermail/linux-nvme/2018-February/015844.html

It appears connecting to infradead lists is a bit flaky at the moment,
so not sure if you'll be able to read the above link right now.

Anyway, the RFC would have needed a bit of work to be considered, like
using a mempool for the purpose, but it does generically make such
formats usable through the block stack so maybe try flushing out that
idea.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-17 16:39         ` Keith Busch
@ 2019-09-18 13:26           ` Christoph Hellwig
  2019-09-18 17:08             ` Keith Busch
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-18 13:26 UTC (permalink / raw)
  To: Keith Busch
  Cc: Baldyga, Robert, Christoph Hellwig, axboe, sagi, linux-nvme,
	linux-kernel, Rakowski, Michal

On Tue, Sep 17, 2019 at 10:39:09AM -0600, Keith Busch wrote:
> On Mon, Sep 16, 2019 at 12:13:24PM +0000, Baldyga, Robert wrote:
> > Ok, fair enough. We want to keep things hidden behind certain layers,
> > and that's definitely a good thing. But there is a problem with these
> > layers - they do not expose all the features. For example AFAIK there
> > is no clear way to use 512+8 format via block layer (nor even a way 
> > to get know that bdev is formatted to particular lbaf). It's impossible
> > to use it without layering violations, which nobody sees as a perfect
> > solution, but it is the only one that works.
> 
> I think your quickest path to supporting such a format is to consider
> each part separately, then bounce and interleave/unmix the data +
> metadata at another layer that understands how the data needs to be laid
> out in host memory. Something like this RFC here:
> 
>   http://lists.infradead.org/pipermail/linux-nvme/2018-February/015844.html
> 
> It appears connecting to infradead lists is a bit flaky at the moment,
> so not sure if you'll be able to read the above link right now.
> 
> Anyway, the RFC would have needed a bit of work to be considered, like
> using a mempool for the purpose, but it does generically make such
> formats usable through the block stack so maybe try flushing out that
> idea.

Even if we had a use case for that the bounce buffer is just too ugly
to live.  And I'm really sick and tired of Intel wasting our time for
their out of tree monster given that they haven't even tried helping
to improve the in-kernel write caching layers.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-18 13:26           ` Christoph Hellwig
@ 2019-09-18 17:08             ` Keith Busch
  2019-09-18 17:09               ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Keith Busch @ 2019-09-18 17:08 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Baldyga, Robert, axboe, sagi, linux-nvme, linux-kernel, Rakowski, Michal

On Wed, Sep 18, 2019 at 03:26:11PM +0200, Christoph Hellwig wrote:
> Even if we had a use case for that the bounce buffer is just too ugly
> to live.  And I'm really sick and tired of Intel wasting our time for
> their out of tree monster given that they haven't even tried helping
> to improve the in-kernel write caching layers.

Right, I don't have any stake in that out-of-tree caching either. I am
more just interested in trying to get Linux to generically reach as many
NVMe spec features as possible, and extended formats have been in-spec
since the beginning of nvme.

And yes, that bouncing is really nasty, but it's really only needed for
PRP, so maybe let's just not ignore that transfer mode and support
extended metadata iff the controller supports SGLs. We just need a
special SGL setup routine to weave the data and metadata.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-18 17:08             ` Keith Busch
@ 2019-09-18 17:09               ` Christoph Hellwig
  2019-09-19  0:40                 ` Martin K. Petersen
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2019-09-18 17:09 UTC (permalink / raw)
  To: Keith Busch
  Cc: Christoph Hellwig, Baldyga, Robert, axboe, sagi, linux-nvme,
	linux-kernel, Rakowski, Michal

On Wed, Sep 18, 2019 at 11:08:07AM -0600, Keith Busch wrote:
> And yes, that bouncing is really nasty, but it's really only needed for
> PRP, so maybe let's just not ignore that transfer mode and support
> extended metadata iff the controller supports SGLs. We just need a
> special SGL setup routine to weave the data and metadata.

Well, what is the point?  If people really want to use metadata they
should just buy a drive supporting the separate metadata pointer.  In
fact I haven't had to deal with a drive that only supports interleaved
metadata so far given how awkward that is to deal with.  But based on
the discussions here it seems Intel is stupid enough to ship such a thing.

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: [PATCH 0/2] nvme: Add kernel API for admin command
  2019-09-18 17:09               ` Christoph Hellwig
@ 2019-09-19  0:40                 ` Martin K. Petersen
  0 siblings, 0 replies; 14+ messages in thread
From: Martin K. Petersen @ 2019-09-19  0:40 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Keith Busch, sagi, linux-kernel, linux-nvme, Rakowski, Michal,
	axboe, Baldyga, Robert


Christoph,

> On Wed, Sep 18, 2019 at 11:08:07AM -0600, Keith Busch wrote:
>> And yes, that bouncing is really nasty, but it's really only needed for
>> PRP, so maybe let's just not ignore that transfer mode and support
>> extended metadata iff the controller supports SGLs. We just need a
>> special SGL setup routine to weave the data and metadata.
>
> Well, what is the point?  If people really want to use metadata they
> should just buy a drive supporting the separate metadata pointer.  In
> fact I haven't had to deal with a drive that only supports interleaved
> metadata so far given how awkward that is to deal with.

Yep. There's a reason we did DIX...

-- 
Martin K. Petersen	Oracle Linux Engineering

^ permalink raw reply	[flat|nested] 14+ messages in thread

end of thread, other threads:[~2019-09-19  0:40 UTC | newest]

Thread overview: 14+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-09-13 11:16 [PATCH 0/2] nvme: Add kernel API for admin command Robert Baldyga
2019-09-13 11:16 ` [PATCH 1/2] nvme: add API for sending admin commands by bdev Robert Baldyga
2019-09-16  6:36   ` kbuild test robot
2019-09-13 11:16 ` [PATCH 2/2] nvme: add API for getting nsid " Robert Baldyga
2019-09-16  6:57   ` kbuild test robot
2019-09-13 14:37 ` [PATCH 0/2] nvme: Add kernel API for admin command Christoph Hellwig
2019-09-16  7:16   ` Baldyga, Robert
2019-09-16  7:34     ` Christoph Hellwig
2019-09-16 12:13       ` Baldyga, Robert
2019-09-17 16:39         ` Keith Busch
2019-09-18 13:26           ` Christoph Hellwig
2019-09-18 17:08             ` Keith Busch
2019-09-18 17:09               ` Christoph Hellwig
2019-09-19  0:40                 ` Martin K. Petersen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).