From: "hch@lst.de" <hch@lst.de>
To: Mark Ruijter <MRuijter@onestopsystems.com>
Cc: Hannes Reinecke <hare@suse.com>,
	"sagi@grimberg.me" <sagi@grimberg.me>,
	Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	Keith Busch <kbusch@kernel.org>, "hch@lst.de" <hch@lst.de>
Subject: Re: [PATCH] nvmet: introduce use_vfs ns-attr
Date: Sun, 27 Oct 2019 16:03:30 +0100	[thread overview]
Message-ID: <20191027150330.GA5843@lst.de> (raw)
In-Reply-To: <109617B2-CC73-4CDE-B97A-FDDB12CD22BD@onestopsystems.com>

On Fri, Oct 25, 2019 at 08:44:00AM +0000, Mark Ruijter wrote:
> 
> Hi Keith,
> 
> I am indeed not using buffered I/O.
> Using the VFS increases my 4k random write performance from 200K to 650K IOPS when using RAID1.
> So the difference is huge, and it becomes more significant when the underlying drives or a RAID0 array can handle more IOPS.

Can you try the patch below to use block layer plugging in nvmet?  Plugging
batches the bio submissions so the block layer can dispatch them to the
driver in one go; it should be the only major difference in how we do I/O.

> 1. Currently a controller ID collision can occur when using a clustered HA setup. See this message:
> >>> [1122789.054677] nvme nvme1: Duplicate cntlid 4 with nvme0, rejecting.
> 
> The controller ID allocation is currently hard-wired:
> 
>        ret = ida_simple_get(&cntlid_ida,
>                              NVME_CNTLID_MIN, NVME_CNTLID_MAX,
>                              GFP_KERNEL);
> 
> So two nodes exporting the exact same volume using the same port configuration can easily come up with the same controller ID.
> I would like to propose making it configurable, with the current logic providing the default.
> SCST, for example, allows manual target ID selection for this reason.

We can allow some control there using a new configfs file.  But what
would be even better is an actually integrated cluster manager, which
we'd need anyway for features such as persistent reservations.
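
A rough sketch of what such a configfs knob could look like, assuming new
per-subsystem cntlid_min/cntlid_max fields (the fields, the attribute
wiring, and the idea that the ida_simple_get() call above would then use
the per-subsystem range instead of NVME_CNTLID_MIN/NVME_CNTLID_MAX are all
illustrative assumptions, not existing code):

/* Hypothetical: let two HA nodes exporting the same volume be given
 * disjoint controller ID ranges via configfs. */
static ssize_t nvmet_subsys_attr_cntlid_min_show(struct config_item *item,
		char *page)
{
	return snprintf(page, PAGE_SIZE, "%u\n", to_subsys(item)->cntlid_min);
}

static ssize_t nvmet_subsys_attr_cntlid_min_store(struct config_item *item,
		const char *page, size_t cnt)
{
	u16 cntlid_min;

	if (sscanf(page, "%hu\n", &cntlid_min) != 1)
		return -EINVAL;
	if (cntlid_min == 0)
		return -EINVAL;

	down_write(&nvmet_config_sem);
	if (cntlid_min > to_subsys(item)->cntlid_max)
		goto out_unlock;
	to_subsys(item)->cntlid_min = cntlid_min;
	up_write(&nvmet_config_sem);
	return cnt;

out_unlock:
	up_write(&nvmet_config_sem);
	return -EINVAL;
}
CONFIGFS_ATTR(nvmet_subsys_, attr_cntlid_min);
/* plus an &nvmet_subsys_attr_attr_cntlid_min entry in nvmet_subsys_attrs[] */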

> 2. The model string of the drives is hard-wired to "Linux". As I see it, this should be configurable, with 'Linux' as the default value.
> I'll provide code that makes that work.

Yes, please send a patch.
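
For reference, the identify-controller path currently reports a hard-coded
"Linux" model string; a per-subsystem override would roughly amount to the
sketch below, where subsys->model is a hypothetical configfs-backed string
that does not exist yet and memcpy_and_pad() space-pads id->mn as the spec
requires:

	/*
	 * Sketch only: fall back to the current "Linux" default unless a
	 * hypothetical per-subsystem model attribute has been set.
	 */
	const char *model = subsys->model ? subsys->model : "Linux";

	memcpy_and_pad(id->mn, sizeof(id->mn), model, strlen(model), ' ');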

> 3. An NVMe-oF-connected disk on the initiator seems to queue I/O forever when the target dies.
> It would be nice if we had the ability to select either 'queue forever' or 'fail fast'.

Making this configurable has been on the todo list for a long time.  At
some point in the past Hannes (added to Cc) signed up for it, but it seems
to have dropped off his priority list.
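
As a sketch of one possible shape for this, based on the existing
nvmf_fail_nonready_command() in the fabrics code: gate the "requeue until
the controller comes back" behaviour behind a new connect option.  The
ctrl->opts->failfast field below is a hypothetical new option, not
something that exists today:

blk_status_t nvmf_fail_nonready_command(struct nvme_ctrl *ctrl,
		struct request *rq)
{
	/*
	 * Keep requeueing while the controller may still come back,
	 * unless the (hypothetical) failfast option was requested.
	 */
	if (ctrl->state != NVME_CTRL_DELETING &&
	    ctrl->state != NVME_CTRL_DEAD &&
	    !ctrl->opts->failfast &&
	    !blk_noretry_request(rq) && !(rq->cmd_flags & REQ_NVME_MPATH))
		return BLK_STS_RESOURCE;

	/* Otherwise fail the command immediately with a host path error. */
	nvme_req(rq)->status = NVME_SC_HOST_PATH_ERROR;
	blk_mq_start_request(rq);
	nvme_complete_rq(rq);
	return BLK_STS_IOERR;
}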

---
From 87ab0d6f9e092cde04775452131f90e8b4c46a66 Mon Sep 17 00:00:00 2001
From: Christoph Hellwig <hch@lst.de>
Date: Sun, 27 Oct 2019 15:59:08 +0100
Subject: nvmet: use block layer plugging in nvmet_bdev_execute_rw

Signed-off-by: Christoph Hellwig <hch@lst.de>
---
 drivers/nvme/target/io-cmd-bdev.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index 04a9cd2a2604..ed1a8d0ab30e 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -147,6 +147,7 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 	int sg_cnt = req->sg_cnt;
 	struct bio *bio;
 	struct scatterlist *sg;
+	struct blk_plug plug;
 	sector_t sector;
 	int op, op_flags = 0, i;
 
@@ -185,6 +186,7 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 	bio->bi_end_io = nvmet_bio_done;
 	bio_set_op_attrs(bio, op, op_flags);
 
+	blk_start_plug(&plug);
 	for_each_sg(req->sg, sg, req->sg_cnt, i) {
 		while (bio_add_page(bio, sg_page(sg), sg->length, sg->offset)
 				!= sg->length) {
@@ -202,6 +204,7 @@ static void nvmet_bdev_execute_rw(struct nvmet_req *req)
 		sector += sg->length >> 9;
 		sg_cnt--;
 	}
 
 	submit_bio(bio);
+	blk_finish_plug(&plug);
 }
-- 
2.20.1


