All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeremy Allison <jra@samba.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Milosz Tanski <milosz@adfin.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-aio@kvack.org, Mel Gorman <mgorman@suse.de>,
	Volker Lendecke <Volker.Lendecke@sernet.de>,
	Tejun Heo <tj@kernel.org>, Jeff Moyer <jmoyer@redhat.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-api@vger.kernel.org,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	linux-arch@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)
Date: Fri, 27 Mar 2015 08:58:54 -0700	[thread overview]
Message-ID: <20150327155854.GA5548@samba2> (raw)
In-Reply-To: <20150327020159.eadd0ce1.akpm@linux-foundation.org>

On Fri, Mar 27, 2015 at 02:01:59AM -0700, Andrew Morton wrote:
> On Fri, 27 Mar 2015 01:48:33 -0700 Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Fri, Mar 27, 2015 at 01:35:16AM -0700, Andrew Morton wrote:
> > > fincore() doesn't have to be ugly.  Please address the design issues I
> > > raised.  How is pread2() useful to the class of applications which
> > > cannot proceed until all data is available?
> > 
> > It actually makes them work correctly?  preadv2( ..., DONTWAIT) will
> > return -EGAIN, which causes them to bounce to the threadpool where
> > they call preadv(...).
> 
> (I assume you mean RWF_NONBLOCK)
> 
> That isn't how pread2() works.  If the leading one or more pages are
> uptodate, pread2() will return a partial read.  Now what?  Either the
> application reads the same data a second time via the worker thread
> (dumb, but it will usually be a rare case)

The problem with the above is that we can't tell the difference
between pread2() returning a short read because the pages are not
in cache, or because someone truncated the file. So we need some
way to differentiate this.

My preference from userspace would be for pread2() to return
EAGAIN if *all* the data requested is not available (where
'all' can be less than the size requested if the file has
been truncated in the meantime).

So:

ret = pread2(fd, buf, size_wanted, RWF_NONBLOCK)

if (ret == -1) {
	if (errno == EAGAIN) {
		goto threadpool...
	}
	.. real error..
}

if (ret == size_wanted) {
	.. normal read, file not truncated...
}

if (ret < size_wanted) {
	.. file was truncated..
}

The thing I want to avoid is the case where
ret < size_wanted means only part of the file
is in cache.

WARNING: multiple messages have this Message-ID (diff)
From: Jeremy Allison <jra@samba.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Christoph Hellwig <hch@infradead.org>,
	Milosz Tanski <milosz@adfin.com>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-aio@kvack.org, Mel Gorman <mgorman@suse.de>,
	Volker Lendecke <Volker.Lendecke@sernet.de>,
	Tejun Heo <tj@kernel.org>, Jeff Moyer <jmoyer@redhat.com>,
	Theodore Ts'o <tytso@mit.edu>, Al Viro <viro@zeniv.linux.org.uk>,
	linux-api@vger.kernel.org,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	linux-arch@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)
Date: Fri, 27 Mar 2015 08:58:54 -0700	[thread overview]
Message-ID: <20150327155854.GA5548@samba2> (raw)
In-Reply-To: <20150327020159.eadd0ce1.akpm@linux-foundation.org>

On Fri, Mar 27, 2015 at 02:01:59AM -0700, Andrew Morton wrote:
> On Fri, 27 Mar 2015 01:48:33 -0700 Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Fri, Mar 27, 2015 at 01:35:16AM -0700, Andrew Morton wrote:
> > > fincore() doesn't have to be ugly.  Please address the design issues I
> > > raised.  How is pread2() useful to the class of applications which
> > > cannot proceed until all data is available?
> > 
> > It actually makes them work correctly?  preadv2( ..., DONTWAIT) will
> > return -EGAIN, which causes them to bounce to the threadpool where
> > they call preadv(...).
> 
> (I assume you mean RWF_NONBLOCK)
> 
> That isn't how pread2() works.  If the leading one or more pages are
> uptodate, pread2() will return a partial read.  Now what?  Either the
> application reads the same data a second time via the worker thread
> (dumb, but it will usually be a rare case)

The problem with the above is that we can't tell the difference
between pread2() returning a short read because the pages are not
in cache, or because someone truncated the file. So we need some
way to differentiate this.

My preference from userspace would be for pread2() to return
EAGAIN if *all* the data requested is not available (where
'all' can be less than the size requested if the file has
been truncated in the meantime).

So:

ret = pread2(fd, buf, size_wanted, RWF_NONBLOCK)

if (ret == -1) {
	if (errno == EAGAIN) {
		goto threadpool...
	}
	.. real error..
}

if (ret == size_wanted) {
	.. normal read, file not truncated...
}

if (ret < size_wanted) {
	.. file was truncated..
}

The thing I want to avoid is the case where
ret < size_wanted means only part of the file
is in cache.

--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org.  For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>

  parent reply	other threads:[~2015-03-27 16:07 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-16 18:27 [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only) Milosz Tanski
2015-03-16 18:27 ` [PATCH v7 1/5] vfs: Prepare for adding a new preadv/pwritev with user flags Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 21:05   ` Andreas Dilger
2015-03-16 21:05     ` Andreas Dilger
2015-03-16 18:27 ` [PATCH v7 2/5] vfs: Define new syscalls preadv2,pwritev2 Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 18:27 ` [PATCH v7 3/5] x86: wire up preadv2 and pwritev2 Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 18:27 ` [PATCH v7 4/5] vfs: RWF_NONBLOCK flag for preadv2 Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 18:27 ` [PATCH v7 5/5] xfs: add RWF_NONBLOCK support Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 22:04   ` Dave Chinner
2015-03-16 18:32 ` [PATCH] Add preadv2/pwritev2 documentation Milosz Tanski
2015-03-27 16:49   ` Andrew Morton
2015-03-30  7:33     ` Christoph Hellwig
2015-03-30  7:33       ` Christoph Hellwig
2015-03-16 18:34 ` [PATCH] fstests: generic test for preadv2 behavior on linux Milosz Tanski
2015-03-16 18:34   ` Milosz Tanski
2015-03-16 21:07   ` Andreas Dilger
2015-03-16 21:07     ` Andreas Dilger
2015-03-16 22:03     ` Milosz Tanski
2015-03-16 22:02   ` Dave Chinner
2015-03-16 22:02     ` Dave Chinner
2015-03-16 22:11     ` Milosz Tanski
2015-03-16 22:56       ` Dave Chinner
2015-03-16 22:56         ` Dave Chinner
2015-03-26 11:55 ` [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only) Christoph Hellwig
2015-03-26 11:55   ` Christoph Hellwig
2015-03-26 19:12   ` Milosz Tanski
2015-03-26 19:12     ` Milosz Tanski
2015-03-27  2:26     ` Milosz Tanski
2015-03-27  2:29     ` Milosz Tanski
2015-03-27  2:29       ` Milosz Tanski
2015-03-27  3:28 ` Andrew Morton
2015-03-27  3:28   ` Andrew Morton
2015-03-27  5:41   ` Volker Lendecke
2015-03-27  5:41     ` Volker Lendecke
2015-03-27  6:08     ` Andrew Morton
2015-03-27  6:08       ` Andrew Morton
2015-03-27  8:02       ` Volker Lendecke
2015-03-27  8:02         ` Volker Lendecke
2015-03-27  8:12         ` Christoph Hellwig
2015-03-27  8:18   ` Christoph Hellwig
2015-03-27  8:18     ` Christoph Hellwig
2015-03-27  8:35     ` Andrew Morton
2015-03-27  8:35       ` Andrew Morton
2015-03-27  8:48       ` Christoph Hellwig
2015-03-27  9:01         ` Andrew Morton
2015-03-27  9:01           ` Andrew Morton
2015-03-27  9:44           ` Volker Lendecke
2015-03-27 15:58           ` Jeremy Allison [this message]
2015-03-27 15:58             ` Jeremy Allison
2015-03-27 16:30             ` Andrew Morton
2015-03-27 16:30               ` Andrew Morton
2015-03-27 16:30               ` Andrew Morton
2015-03-27 16:30               ` Andrew Morton
2015-03-27 16:39               ` Jeremy Allison
2015-03-27 16:39                 ` Jeremy Allison
2015-03-27 16:39               ` Andrew Morton
2015-03-27 16:45               ` Milosz Tanski
2015-03-31  1:27               ` Milosz Tanski
2015-03-27 16:38             ` Milosz Tanski
2015-03-27 16:38               ` Milosz Tanski
2015-03-30  7:36             ` Christoph Hellwig
2015-03-30 17:19               ` Jeremy Allison
2015-03-30 17:19                 ` Jeremy Allison
2015-03-30 22:51                 ` Milosz Tanski
2015-03-30 20:26               ` Andrew Morton
2015-03-30 20:26                 ` Andrew Morton
2015-03-30 20:32                 ` Jeremy Allison
2015-03-30 20:37                   ` Andrew Morton
2015-03-30 20:49                     ` Jeremy Allison
2015-03-30 21:33                       ` Andrew Morton
2015-03-30 22:35                     ` Milosz Tanski
2015-03-30 22:49                   ` Milosz Tanski
2015-03-30 22:57                     ` Andrew Morton
2015-03-30 23:06                       ` Milosz Tanski
2015-03-30 23:06                         ` Milosz Tanski
2015-03-30 23:25                 ` Milosz Tanski
2015-04-04  3:42                 ` Andrew Morton
2015-04-06  3:53                   ` Milosz Tanski
2015-04-06  3:53                     ` Milosz Tanski
2015-03-30 23:09               ` Milosz Tanski
2015-03-27 15:21   ` Milosz Tanski
2015-03-27 15:21     ` Milosz Tanski
2015-03-27 17:04     ` Andrew Morton
2015-03-30  7:40       ` Christoph Hellwig
2015-03-30  7:40         ` Christoph Hellwig
2015-03-30 18:54         ` Andrew Morton
2015-03-30 22:40           ` Milosz Tanski
2015-03-30 22:50             ` Andrew Morton
2015-03-30 22:50               ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150327155854.GA5548@samba2 \
    --to=jra@samba.org \
    --cc=Volker.Lendecke@sernet.de \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jmoyer@redhat.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=milosz@adfin.com \
    --cc=mtk.manpages@gmail.com \
    --cc=tj@kernel.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.