linux-kernel.vger.kernel.org archive mirror
From: Benjamin LaHaise <bcrl@kvack.org>
To: Christoph Hellwig <hch@infradead.org>
Cc: Milosz Tanski <milosz@adfin.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-aio@kvack.org, Mel Gorman <mgorman@suse.de>,
	Volker Lendecke <Volker.Lendecke@sernet.de>,
	Tejun Heo <tj@kernel.org>, Jeff Moyer <jmoyer@redhat.com>,
	Andreas Dilger <adilger@dilger.ca>
Subject: Re: [RFC 1/2] aio: async readahead
Date: Fri, 19 Sep 2014 12:01:13 -0400
Message-ID: <20140919160113.GE24821@kvack.org>
In-Reply-To: <20140919112612.GC4639@infradead.org>

On Fri, Sep 19, 2014 at 04:26:12AM -0700, Christoph Hellwig wrote:
> Requiring the block mappings to be entirely async is why we never went
> for full buffered aio.  What would seem more useful is to offload all
> readahead to workqueues to make sure they never block the caller for
> sys_readahead or if we decide to readahead for the nonblocking read.

I can appreciate that it may be difficult for some filesystems to implement
a fully asynchronous readpage, but for at least some of them it is both
possible and not too difficult.

> I tried to implement this, but I couldn't find a good place to hang
> the work_struct for it off.  If we decide to dynamically allocate
> the ra structure separate from struct file that might be an obvious
> place.

The approach I used in the async ext2/3/4 indirect-style metadata readpage
was to put the async state into the page's memory.  That won't work very
well on 32-bit systems, but on 64-bit systems it works well and avoids
having to perform another memory allocation.

I'm still of the opinion that the readpage operation should be started by
the submitting process.  Some of the work I did in tuning things for my
employer with async reads found that punting reads to another thread
caused significant degradation of our workload (basically, reading in a
bunch of persistent messages from disk, with small messages being an
important corner case for performance).  What ended up performing best
for me was to have an async readahead operation fill the page cache
with data from the file, and then to issue a read that was essentially
non-blocking.  This approach meant that the copy of data from the kernel
into userspace was performed by the thread that was actually using the
data.  By doing the copy only once all I/O had completed, the data was
primed in the CPU's cache, allowing the code that actually operates on the
data to benefit.  Any gradual copy over time ended up performing
significantly worse.
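
A rough userspace sketch of the two-step pattern described above: prefetch
into the page cache first, then do the copy in the consuming thread with a
read that refuses to block.  This is not the aio readahead operation from
the RFC.  It assumes a recent Linux kernel and glibc, and substitutes
posix_fadvise(POSIX_FADV_WILLNEED) for the async readahead (that call may
itself block while the filesystem maps blocks) and preadv2() with
RWF_NOWAIT for the non-blocking read.  The function names are hypothetical.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Step 1: hint that [off, off + len) will be needed soon, so the kernel
 * can start reading it into the page cache.  Stand-in for the async
 * readahead operation; returns 0 or an error number. */
int start_prefetch(int fd, off_t off, size_t len)
{
	return posix_fadvise(fd, off, (off_t)len, POSIX_FADV_WILLNEED);
}

/* Step 2: copy to userspace only if that can be done without waiting for
 * I/O.  Returns the number of bytes copied (possibly short), or -1 with
 * errno set, e.g. EAGAIN when the data is not cached yet. */
ssize_t try_cached_read(int fd, void *buf, size_t len, off_t off)
{
	struct iovec iov = { .iov_base = buf, .iov_len = len };

	return preadv2(fd, &iov, 1, off, RWF_NOWAIT);
}

/* Hypothetical convenience wrapper: prefetch, try the cached read, and
 * fall back to an ordinary blocking pread() when the data is not cached
 * or RWF_NOWAIT is unsupported on this kernel or filesystem. */
ssize_t prefetch_then_read(int fd, void *buf, size_t len, off_t off)
{
	ssize_t n;

	assert(buf != NULL);
	(void)start_prefetch(fd, off, len);	/* advisory; errors ignored */

	n = try_cached_read(fd, buf, len, off);
	if (n >= 0)
		return n;
	if (errno == EAGAIN || errno == EOPNOTSUPP || errno == ENOSYS)
		return pread(fd, buf, len, off);
	return -1;
}
```

In a real event loop the caller would issue start_prefetch() early, go do
other work, and call try_cached_read() only when it is ready to consume
the data, so that the copy lands in the cache of the CPU that uses it.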

		-ben
-- 
"Thought is the essence of where you are now."

