Linux-NVME Archive on lore.kernel.org
From: Mark Ruijter <MRuijter@onestopsystems.com>
To: Keith Busch <kbusch@kernel.org>
Cc: Chaitanya Kulkarni <chaitanya.kulkarni@wdc.com>,
	"hch@lst.de" <hch@lst.de>,
	"linux-nvme@lists.infradead.org" <linux-nvme@lists.infradead.org>,
	"sagi@grimberg.me" <sagi@grimberg.me>
Subject: Re: [PATCH] nvmet: introduce use_vfs ns-attr
Date: Fri, 25 Oct 2019 08:44:00 +0000
Message-ID: <109617B2-CC73-4CDE-B97A-FDDB12CD22BD@onestopsystems.com> (raw)
In-Reply-To: <20191025042658.GB19941@redsun51.ssa.fujisawa.hgst.com>


Hi Keith,

I am indeed not using buffered I/O.
Using the VFS increases my 4k random write performance from 200K to 650K IOPS when using raid1.
So the difference is huge, and it becomes even more significant when the underlying drives or a raid0 can handle more IOPS.

Over the next few days I am going to provide a number of patches.

1. Currently a controller id collision can occur when using a clustered HA setup. See this message:
>>> [1122789.054677] nvme nvme1: Duplicate cntlid 4 with nvme0, rejecting.

The way the controller ID is allocated is currently hard-wired:

       ret = ida_simple_get(&cntlid_ida,
                             NVME_CNTLID_MIN, NVME_CNTLID_MAX,
                             GFP_KERNEL);

So two nodes exporting the exact same volume with the same port configuration can easily come up with the same controller ID.
I would like to propose making it configurable, with the current logic providing the default; a rough configfs sketch follows below the list.
SCST, for example, allows manual target ID selection for exactly this reason.

2. The model string of the exported drives is hard-wired to 'Linux'. As I see it, this should be configurable, with 'Linux' as the default value.
I'll provide code that makes that work; a minimal sketch follows below the list.

3. An NVMe-oF connected disk on the initiator seems to queue I/O forever when the target dies.
It would be nice if we had the ability to select either 'queue forever' or 'fail fast'; a purely hypothetical sketch of that choice is below.
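
For point 1, roughly what I have in mind, as a sketch only: a per-subsystem configfs attribute for the lower bound of the cntlid range, so each HA node can be handed a disjoint range. The cntlid_min field does not exist today; it would be a new member of struct nvmet_subsys, the attribute still has to be wired into the subsystem attribute array, and the ida_simple_get() call above would then take subsys->cntlid_min instead of NVME_CNTLID_MIN.

static ssize_t nvmet_subsys_attr_cntlid_min_show(struct config_item *item,
                char *page)
{
        /* cntlid_min is an assumed new field on struct nvmet_subsys. */
        return snprintf(page, PAGE_SIZE, "%u\n", to_subsys(item)->cntlid_min);
}

static ssize_t nvmet_subsys_attr_cntlid_min_store(struct config_item *item,
                const char *page, size_t count)
{
        u16 cntlid_min;

        if (kstrtou16(page, 0, &cntlid_min))
                return -EINVAL;
        if (cntlid_min < NVME_CNTLID_MIN || cntlid_min > NVME_CNTLID_MAX)
                return -EINVAL;

        down_write(&nvmet_config_sem);
        to_subsys(item)->cntlid_min = cntlid_min;
        up_write(&nvmet_config_sem);
        return count;
}
CONFIGFS_ATTR(nvmet_subsys_, attr_cntlid_min);

With the two nodes of the cluster given disjoint ranges, the duplicate cntlid rejection above cannot happen anymore.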
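
For point 2, a minimal sketch of what I plan to send (assuming a new 'model' string member on struct nvmet_subsys; falling back to "Linux" keeps today's behaviour, and the spec-mandated length/character checks are left out for brevity):

static ssize_t nvmet_subsys_attr_model_show(struct config_item *item,
                char *page)
{
        struct nvmet_subsys *subsys = to_subsys(item);

        /* 'model' is an assumed new field; NULL means today's default. */
        return snprintf(page, PAGE_SIZE, "%s\n",
                        subsys->model ? subsys->model : "Linux");
}

static ssize_t nvmet_subsys_attr_model_store(struct config_item *item,
                const char *page, size_t count)
{
        struct nvmet_subsys *subsys = to_subsys(item);
        char *model;

        /* Strip a trailing newline and reject an empty string. */
        model = kstrndup(page, strcspn(page, "\n"), GFP_KERNEL);
        if (!model)
                return -ENOMEM;
        if (!strlen(model)) {
                kfree(model);
                return -EINVAL;
        }

        /* Serialization against concurrent readers is glossed over here. */
        down_write(&nvmet_config_sem);
        kfree(subsys->model);
        subsys->model = model;
        up_write(&nvmet_config_sem);
        return count;
}
CONFIGFS_ATTR(nvmet_subsys_, attr_model);

The identify-controller path would then copy subsys->model, or "Linux" when unset, into the MN field.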
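
For point 3 this is purely hypothetical, just to illustrate the knob I mean: a 'failfast' flag on the fabrics connect options (no such field exists today), consulted where the host driver currently requeues I/O while a controller is unreachable.

/* Sketch only: 'failfast' is an assumed new nvmf connect option. */
static bool nvme_should_fail_fast(struct nvme_ctrl *ctrl)
{
        return ctrl->opts && ctrl->opts->failfast &&
               ctrl->state != NVME_CTRL_LIVE;
}

When this returns true the request would be completed with an error instead of being requeued, so filesystems and dm-multipath on the initiator see the failure quickly.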

I hope this makes sense,

Mark Ruijter

On 25-10-19 06:27, Keith Busch <kbusch@kernel.org> wrote:

    On Fri, Oct 25, 2019 at 01:05:40PM +0900, Keith Busch wrote:
    > On Thu, Oct 24, 2019 at 11:30:18AM +0000, Mark Ruijter wrote:
    > > Note that I wrote this patch to prove that a performance problem exists when using raid1.
    > > Either the md raid1 driver or the io-cmd-bdev.c code has issues.
    > > When you add an additional layer like the VFS, the performance should typically drop by 5~10%.
    > > However, in this case the performance increases, even though the nvme target uses direct I/O and the random writes do not get merged by the VFS.
    > 
    > Are we really using direct I/O when the nvme target is going through the VFS,
    > though? That should happen if we've set IOCB_DIRECT in the ki_flags,
    > but I don't see that happening, and if that's right, then the difference
    > sounds like it's related to buffered I/O.
    
    Err, I see we actually default to direct when we open the file. You'd
    have to change that through configfs to use buffered, which I assume
    you're not doing. My mistake.
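
(For reference, and paraphrased from memory rather than copied from io-cmd-file.c: the file backend open looks roughly like the snippet below, so O_DIRECT is indeed the default and buffered I/O has to be opted into per namespace through the buffered_io attribute.)

        int flags = O_RDWR | O_LARGEFILE;

        /* buffered_io is the existing per-namespace configfs attribute. */
        if (!ns->buffered_io)
                flags |= O_DIRECT;

        ns->file = filp_open(ns->device_path, flags, 0);
        if (IS_ERR(ns->file))
                return PTR_ERR(ns->file);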
    

_______________________________________________
Linux-nvme mailing list
Linux-nvme@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-nvme

Thread overview: 19+ messages
2019-10-23 20:17 Chaitanya Kulkarni
2019-10-24  2:00 ` Keith Busch
2019-10-24 11:30   ` Mark Ruijter
2019-10-25  4:05     ` Keith Busch
2019-10-25  4:26       ` Keith Busch
2019-10-25  8:44         ` Mark Ruijter [this message]
2019-10-26  1:06           ` Keith Busch
2019-10-27 15:03           ` hch
2019-10-27 16:06             ` Mark Ruijter
2019-10-28  0:55             ` Keith Busch
2019-10-28  7:26               ` Chaitanya Kulkarni
2019-10-28  7:32               ` Chaitanya Kulkarni
2019-10-28  7:35                 ` hch
2019-10-28  7:38                   ` Chaitanya Kulkarni
2019-10-28  7:43                     ` hch
2019-10-28  8:04                       ` Chaitanya Kulkarni
2019-10-28  8:01                 ` Keith Busch
2019-10-28  8:41                   ` Mark Ruijter
2019-10-25  3:29   ` Chaitanya Kulkarni
