All of lore.kernel.org
 help / color / mirror / Atom feed
From: Volker Lendecke <Volker.Lendecke@SerNet.DE>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Milosz Tanski <milosz@adfin.com>,
	linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-aio@kvack.org,
	Mel Gorman <mgorman@suse.de>, Tejun Heo <tj@kernel.org>,
	Jeff Moyer <jmoyer@redhat.com>, "Theodore Ts'o" <tytso@mit.edu>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-api@vger.kernel.org,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	linux-arch@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)
Date: Fri, 27 Mar 2015 06:41:25 +0100	[thread overview]
Message-ID: <E1YbN1J-0084qO-3s@intern.SerNet.DE> (raw)
In-Reply-To: <20150326202824.65d03787.akpm@linux-foundation.org>

On Thu, Mar 26, 2015 at 08:28:24PM -0700, Andrew Morton wrote:
> A thing which bugs me about pread2() is that it is specifically
> tailored to applications which are able to use a partial read result. 
> ie, by sending it over the network.

Can you explain what you mean by this? Samba gets a pread
request from a client for some bytes. The client will be
confused when we send less than requested although the file
is long enough to satisfy all.

> And of course fincore could be used by Samba etc to avoid blocking on
> reads.  It wouldn't perform quite as well as pread2(), but I bet it's
> good enough.
> 
> Bottom line: with pread2() there's still a need for fincore(), but with
> fincore() there probably isn't a need for pread2().

fincore would be a second syscall per pread, and it is not
atomic. I've had discussions with MIPS based vendors who
are worried about every single syscall. This is the #1
hottest code path in Samba.

> And I'm doubtful about claims that it absolutely has to be non-blocking
> 100% of the time.  I bet that 99.99% is good enough.  A fincore()
> option to run mark_page_accessed() against present pages would help
> with the race-with-reclaim situation.

If you can make sure that after an fincore the pages remain
in memory for x milliseconds the atomicity concern might go
away.

Volker

-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:kontakt@sernet.de

WARNING: multiple messages have this Message-ID (diff)
From: Volker Lendecke <Volker.Lendecke@SerNet.DE>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Milosz Tanski <milosz@adfin.com>,
	linux-kernel@vger.kernel.org,
	Christoph Hellwig <hch@infradead.org>,
	linux-fsdevel@vger.kernel.org, linux-aio@kvack.org,
	Mel Gorman <mgorman@suse.de>, Tejun Heo <tj@kernel.org>,
	Jeff Moyer <jmoyer@redhat.com>, Theodore Ts'o <tytso@mit.edu>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-api@vger.kernel.org,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	linux-arch@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)
Date: Fri, 27 Mar 2015 06:41:25 +0100	[thread overview]
Message-ID: <E1YbN1J-0084qO-3s@intern.SerNet.DE> (raw)
In-Reply-To: <20150326202824.65d03787.akpm@linux-foundation.org>

On Thu, Mar 26, 2015 at 08:28:24PM -0700, Andrew Morton wrote:
> A thing which bugs me about pread2() is that it is specifically
> tailored to applications which are able to use a partial read result. 
> ie, by sending it over the network.

Can you explain what you mean by this? Samba gets a pread
request from a client for some bytes. The client will be
confused when we send less than requested although the file
is long enough to satisfy all.

> And of course fincore could be used by Samba etc to avoid blocking on
> reads.  It wouldn't perform quite as well as pread2(), but I bet it's
> good enough.
> 
> Bottom line: with pread2() there's still a need for fincore(), but with
> fincore() there probably isn't a need for pread2().

fincore would be a second syscall per pread, and it is not
atomic. I've had discussions with MIPS based vendors who
are worried about every single syscall. This is the #1
hottest code path in Samba.

> And I'm doubtful about claims that it absolutely has to be non-blocking
> 100% of the time.  I bet that 99.99% is good enough.  A fincore()
> option to run mark_page_accessed() against present pages would help
> with the race-with-reclaim situation.

If you can make sure that after an fincore the pages remain
in memory for x milliseconds the atomicity concern might go
away.

Volker

-- 
SerNet GmbH, Bahnhofsallee 1b, 37081 Göttingen
phone: +49-551-370000-0, fax: +49-551-370000-9
AG Göttingen, HRB 2816, GF: Dr. Johannes Loxen
http://www.sernet.de, mailto:kontakt@sernet.de

--
To unsubscribe, send a message with 'unsubscribe linux-aio' in
the body to majordomo@kvack.org.  For more info on Linux AIO,
see: http://www.kvack.org/aio/
Don't email: <a href=mailto:"aart@kvack.org">aart@kvack.org</a>

  reply	other threads:[~2015-03-27  6:04 UTC|newest]

Thread overview: 94+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-03-16 18:27 [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only) Milosz Tanski
2015-03-16 18:27 ` [PATCH v7 1/5] vfs: Prepare for adding a new preadv/pwritev with user flags Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 21:05   ` Andreas Dilger
2015-03-16 21:05     ` Andreas Dilger
2015-03-16 18:27 ` [PATCH v7 2/5] vfs: Define new syscalls preadv2,pwritev2 Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 18:27 ` [PATCH v7 3/5] x86: wire up preadv2 and pwritev2 Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 18:27 ` [PATCH v7 4/5] vfs: RWF_NONBLOCK flag for preadv2 Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 18:27 ` [PATCH v7 5/5] xfs: add RWF_NONBLOCK support Milosz Tanski
2015-03-16 18:27   ` Milosz Tanski
2015-03-16 22:04   ` Dave Chinner
2015-03-16 18:32 ` [PATCH] Add preadv2/pwritev2 documentation Milosz Tanski
2015-03-27 16:49   ` Andrew Morton
2015-03-30  7:33     ` Christoph Hellwig
2015-03-30  7:33       ` Christoph Hellwig
2015-03-16 18:34 ` [PATCH] fstests: generic test for preadv2 behavior on linux Milosz Tanski
2015-03-16 18:34   ` Milosz Tanski
2015-03-16 21:07   ` Andreas Dilger
2015-03-16 21:07     ` Andreas Dilger
2015-03-16 22:03     ` Milosz Tanski
2015-03-16 22:02   ` Dave Chinner
2015-03-16 22:02     ` Dave Chinner
2015-03-16 22:11     ` Milosz Tanski
2015-03-16 22:56       ` Dave Chinner
2015-03-16 22:56         ` Dave Chinner
2015-03-26 11:55 ` [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only) Christoph Hellwig
2015-03-26 11:55   ` Christoph Hellwig
2015-03-26 19:12   ` Milosz Tanski
2015-03-26 19:12     ` Milosz Tanski
2015-03-27  2:26     ` Milosz Tanski
2015-03-27  2:29     ` Milosz Tanski
2015-03-27  2:29       ` Milosz Tanski
2015-03-27  3:28 ` Andrew Morton
2015-03-27  3:28   ` Andrew Morton
2015-03-27  5:41   ` Volker Lendecke [this message]
2015-03-27  5:41     ` Volker Lendecke
2015-03-27  6:08     ` Andrew Morton
2015-03-27  6:08       ` Andrew Morton
2015-03-27  8:02       ` Volker Lendecke
2015-03-27  8:02         ` Volker Lendecke
2015-03-27  8:12         ` Christoph Hellwig
2015-03-27  8:18   ` Christoph Hellwig
2015-03-27  8:18     ` Christoph Hellwig
2015-03-27  8:35     ` Andrew Morton
2015-03-27  8:35       ` Andrew Morton
2015-03-27  8:48       ` Christoph Hellwig
2015-03-27  9:01         ` Andrew Morton
2015-03-27  9:01           ` Andrew Morton
2015-03-27  9:44           ` Volker Lendecke
2015-03-27 15:58           ` Jeremy Allison
2015-03-27 15:58             ` Jeremy Allison
2015-03-27 16:30             ` Andrew Morton
2015-03-27 16:30               ` Andrew Morton
2015-03-27 16:30               ` Andrew Morton
2015-03-27 16:30               ` Andrew Morton
2015-03-27 16:39               ` Jeremy Allison
2015-03-27 16:39                 ` Jeremy Allison
2015-03-27 16:39               ` Andrew Morton
2015-03-27 16:45               ` Milosz Tanski
2015-03-31  1:27               ` Milosz Tanski
2015-03-27 16:38             ` Milosz Tanski
2015-03-27 16:38               ` Milosz Tanski
2015-03-30  7:36             ` Christoph Hellwig
2015-03-30 17:19               ` Jeremy Allison
2015-03-30 17:19                 ` Jeremy Allison
2015-03-30 22:51                 ` Milosz Tanski
2015-03-30 20:26               ` Andrew Morton
2015-03-30 20:26                 ` Andrew Morton
2015-03-30 20:32                 ` Jeremy Allison
2015-03-30 20:37                   ` Andrew Morton
2015-03-30 20:49                     ` Jeremy Allison
2015-03-30 21:33                       ` Andrew Morton
2015-03-30 22:35                     ` Milosz Tanski
2015-03-30 22:49                   ` Milosz Tanski
2015-03-30 22:57                     ` Andrew Morton
2015-03-30 23:06                       ` Milosz Tanski
2015-03-30 23:06                         ` Milosz Tanski
2015-03-30 23:25                 ` Milosz Tanski
2015-04-04  3:42                 ` Andrew Morton
2015-04-06  3:53                   ` Milosz Tanski
2015-04-06  3:53                     ` Milosz Tanski
2015-03-30 23:09               ` Milosz Tanski
2015-03-27 15:21   ` Milosz Tanski
2015-03-27 15:21     ` Milosz Tanski
2015-03-27 17:04     ` Andrew Morton
2015-03-30  7:40       ` Christoph Hellwig
2015-03-30  7:40         ` Christoph Hellwig
2015-03-30 18:54         ` Andrew Morton
2015-03-30 22:40           ` Milosz Tanski
2015-03-30 22:50             ` Andrew Morton
2015-03-30 22:50               ` Andrew Morton

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=E1YbN1J-0084qO-3s@intern.SerNet.DE \
    --to=volker.lendecke@sernet.de \
    --cc=akpm@linux-foundation.org \
    --cc=david@fromorbit.com \
    --cc=hch@infradead.org \
    --cc=jmoyer@redhat.com \
    --cc=linux-aio@kvack.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mgorman@suse.de \
    --cc=milosz@adfin.com \
    --cc=mtk.manpages@gmail.com \
    --cc=tj@kernel.org \
    --cc=tytso@mit.edu \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.