From: "Elliott, Robert (Server Storage)" <Elliott@hp.com>
To: Christoph Hellwig <hch@infradead.org>
Cc: Andreas Dilger <adilger@dilger.ca>,
	Milosz Tanski <milosz@adfin.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	"linux-aio@kvack.org" <linux-aio@kvack.org>,
	Mel Gorman <mgorman@suse.de>,
	Volker Lendecke <Volker.Lendecke@sernet.de>,
	Tejun Heo <tj@kernel.org>, Jeff Moyer <jmoyer@redhat.com>
Subject: RE: [RFC PATCH 0/7] Non-blockling buffered fs read (page cache only)
Date: Mon, 22 Sep 2014 16:25:24 +0000
Message-ID: <94D0CD8314A33A4D9D801C0FE68B402958C9102B@G9W0745.americas.hpqcorp.net>
In-Reply-To: <20140919112147.GA4639@infradead.org>



> -----Original Message-----
> From: Christoph Hellwig [mailto:hch@infradead.org]
> Sent: Friday, 19 September, 2014 6:22 AM
> To: Elliott, Robert (Server Storage)
> Cc: Andreas Dilger; Milosz Tanski; linux-kernel@vger.kernel.org; Christoph
> Hellwig; linux-fsdevel@vger.kernel.org; linux-aio@kvack.org; Mel Gorman;
> Volker Lendecke; Tejun Heo; Jeff Moyer
> Subject: Re: [RFC PATCH 0/7] Non-blockling buffered fs read (page cache only)
> 
> On Mon, Sep 15, 2014 at 10:36:46PM +0000, Elliott, Robert (Server Storage)
> wrote:
> > That sounds like the proposed WRITE SCATTERED/READ GATHERED
> > commands for SCSI (which are related to, but not necessarily
> > tied to, atomic writes).  We discussed them a bit at
> > LSF-MM 2013 - see http://lwn.net/Articles/548116/.
> 
> In the same way a preadx/pwritex could use but would not require an
> O_ATOMIC.  What's the status of those in t10?  Last I heard
> READ GATHERED was out and they were only looking into WRITE SCATTERED?

Both of these essentially require more CDB bytes to convey the
LBA range list.  Under the current SCSI architecture model, the 
choices are:
* include in a longer CDB
* include in the data-out buffer

For longer CDBs:
* CDBs longer than 16 bytes are not widely supported
* the 260-byte maximum CDB size limits the number of LBA ranges
* in most SCSI protocols, commands are unsolicited (pushed rather
than pulled), so the target must have buffer space for (max queue
depth)*(max CDB size). In SCSI Express, although CDBs are pulled
with PCIe memory reads rather than pushed, longer CDBs complicate
circular queue handling.
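
To put rough numbers on the two size constraints above, here is a
back-of-the-envelope sketch in Python. Only the 260-byte maximum CDB
size comes from the discussion; the 32-byte fixed header, the 16-byte
range descriptor and the queue depth of 256 are illustrative
assumptions, not values from any T10 proposal.

```python
# Back-of-the-envelope sizing for LBA range lists carried in a long CDB.
# Only MAX_CDB_BYTES comes from the discussion; every other number used
# below is an illustrative assumption, not taken from any T10 proposal.

MAX_CDB_BYTES = 260  # maximum CDB size mentioned above


def max_lba_ranges(cdb_bytes, header_bytes, descriptor_bytes):
    """Return how many whole range descriptors fit after the fixed header."""
    if cdb_bytes < header_bytes:
        return 0
    return (cdb_bytes - header_bytes) // descriptor_bytes


def target_cdb_buffer_bytes(max_queue_depth, max_cdb_bytes):
    """Return the CDB buffer space a target needs when commands are pushed."""
    return max_queue_depth * max_cdb_bytes


if __name__ == "__main__":
    # Assumed layout: 32-byte fixed header, 16-byte descriptor
    # (8-byte LBA, 4-byte block count, 4 bytes reserved).
    print(max_lba_ranges(MAX_CDB_BYTES, 32, 16))
    # Assumed queue depth of 256 outstanding commands.
    print(target_cdb_buffer_bytes(256, MAX_CDB_BYTES))
```

Under these assumed sizes only a handful of ranges fit in one CDB,
while the target-side buffer requirement grows linearly with both the
queue depth and the maximum CDB size.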

For the data-out buffer:
* not delivering all the CDB info up front complicates drive
hardware designs. Drives want to start the data transfer from
the medium as early as possible, but have to wait for a whole
extra DMA transfer first. This is not so bad for low-latency
PCIe, but is not a good fit for protocols behind HBAs such as
SAS and iSCSI.
* READ GATHERED requires bidirectional command support, which
is not widely or efficiently supported
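
As a sketch of the data-out alternative, the LBA range list could be
serialized into a buffer that the drive has to fetch and parse before
it can start any medium access. The descriptor layout below
(big-endian 8-byte LBA followed by a 4-byte block count) and the
function names are hypothetical, chosen only for illustration; they do
not reflect the actual WRITE SCATTERED or READ GATHERED proposals.

```python
import struct

# Hypothetical descriptor layout (an assumption for illustration only):
# big-endian 8-byte starting LBA followed by a 4-byte block count.
RANGE_FORMAT = ">QI"
RANGE_SIZE = struct.calcsize(RANGE_FORMAT)  # 12 bytes, no padding


def pack_lba_ranges(ranges):
    """Serialize (lba, block_count) pairs into a data-out style buffer."""
    return b"".join(struct.pack(RANGE_FORMAT, lba, count)
                    for lba, count in ranges)


def unpack_lba_ranges(buf):
    """Parse the buffer back into a list of (lba, block_count) tuples."""
    if len(buf) % RANGE_SIZE:
        raise ValueError("buffer is not a whole number of descriptors")
    return [struct.unpack_from(RANGE_FORMAT, buf, offset)
            for offset in range(0, len(buf), RANGE_SIZE)]


if __name__ == "__main__":
    payload = pack_lba_ranges([(0, 8), (1024, 16), (1 << 40, 1)])
    # The drive cannot learn any of these ranges until this whole
    # buffer has arrived in a separate transfer.
    print(len(payload), unpack_lba_ranges(payload))
```

The point of the sketch is the ordering: the ranges live in the data
buffer rather than in the CDB, so the drive sees them only after an
extra transfer completes.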

Protocols could add direct support for delivering more CDB bytes
(like how the ATA PACKET command delivers a SCSI CDB over
an ATA transport), but that requires a lot of changes.

---
Rob Elliott    HP Server Storage



