From: Jeremy Allison <jra@samba.org>
To: Andrew Morton <akpm@linux-foundation.org>
Cc: Jeremy Allison <jra@samba.org>, Christoph Hellwig <hch@infradead.org>,
	Milosz Tanski <milosz@adfin.com>, linux-kernel@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, linux-aio@kvack.org,
	Mel Gorman <mgorman@suse.de>,
	Volker Lendecke <Volker.Lendecke@sernet.de>,
	Tejun Heo <tj@kernel.org>, Jeff Moyer <jmoyer@redhat.com>,
	"Theodore Ts'o" <tytso@mit.edu>, Al Viro <viro@zeniv.linux.org.uk>,
	linux-api@vger.kernel.org,
	Michael Kerrisk <mtk.manpages@gmail.com>,
	linux-arch@vger.kernel.org, Dave Chinner <david@fromorbit.com>
Subject: Re: [PATCH v7 0/5] vfs: Non-blockling buffered fs read (page cache only)
Date: Fri, 27 Mar 2015 09:39:08 -0700
Message-ID: <20150327163908.GB5548@samba2>
In-Reply-To: <20150327093046.53c2769a.akpm@linux-foundation.org>

On Fri, Mar 27, 2015 at 09:30:46AM -0700, Andrew Morton wrote:
>
> But from an interface perspective the behaviour you're asking for is
> insane, frankly - if the kernel copied out 8k of data then pread2()
> should return 8k. Otherwise there's no way for userspace to know that
> the 8k copy actually happened and we have just wasted a great pile of
> CPU doing a pointless memcpy.

Why would it do the copy in the first place if we asked
(for example) for 16k, but only 8k was available ?

Just return EAGAIN and have done with it.

> I expect that this situation (first part in cache, latter part not in
> cache) is rare - for reasonably small requests the common cases will be
> "all cached" and "nothing cached". So perhaps the best approach here
> is for samba to add special handling for the short read, to work out
> the reason for its occurrence.

We can do that, but as Volker says this is a very hot
code path.

> I take it from your comments that nobody has actually wired up pread2()
> into samba yet? That's a bit disturbing, because if we later want to
> go and change something like this short-read behaviour, we're screwed -
> it's a non back-compat userspace-visible change.

It's been done as a test, so the code exists and has run
(and improved performance as I recall). Not much point
committing it without kernel support :-).

> And a note on cosmetics: why are we using EAGAIN here rather than
> EWOULDBLOCK? They have the same numerical value, but EWOULDBLOCK is a
> better name - EAGAIN says "run it again", but that won't work.

Sounds good to me !
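
The caller-side handling discussed in this message (serve a fully cached read directly, hand anything else to a blocking path) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not Samba's code: `try_cached_read` and the `nonblocking_read` callable are hypothetical names. `preadv_nowait` uses `os.preadv` with `os.RWF_NOWAIT`, the flag available in mainline Linux and Python 3.7+, assumed here as a stand-in for the RWF_NONBLOCK flag proposed in this series.

```python
import errno
import os


def preadv_nowait(fd, length, offset):
    """Read only what is already cached, using os.preadv with RWF_NOWAIT.

    Assumption: RWF_NOWAIT stands in for the RWF_NONBLOCK flag proposed in
    this thread. Needs Linux and a filesystem that supports the flag.
    """
    buf = bytearray(length)
    n = os.preadv(fd, [buf], offset, os.RWF_NOWAIT)
    return bytes(buf[:n])


def try_cached_read(fd, length, offset, nonblocking_read=preadv_nowait):
    """Return (data, needs_blocking_read).

    nonblocking_read is a hypothetical callable(fd, length, offset) -> bytes
    that raises an OSError with EAGAIN/EWOULDBLOCK if nothing is cached.
    Python maps both errno names to BlockingIOError.
    """
    try:
        data = nonblocking_read(fd, length, offset)
    except BlockingIOError:
        # Nothing was cached: the whole request goes to the blocking path.
        return b"", True
    # A short read is ambiguous (EOF, or the tail is not cached), so the
    # caller still has to consult the blocking path to learn the reason.
    return data, len(data) < length


if __name__ == "__main__":
    # Stand-in readers so the control flow can be exercised anywhere.
    def all_cached(fd, length, offset):
        return b"x" * length

    def half_cached(fd, length, offset):
        return b"x" * (length // 2)

    def nothing_cached(fd, length, offset):
        raise OSError(errno.EAGAIN, "data not in page cache")

    for reader in (all_cached, half_cached, nothing_cached):
        data, defer = try_cached_read(-1, 16384, 0, reader)
        print(reader.__name__, len(data), "defer" if defer else "serve")
```

The short-read branch is the extra work on the hot path that the reply objects to: with a partial copy the caller cannot tell end-of-file from "rest not cached" without a second step, whereas an all-or-nothing EAGAIN would need only the `except` branch.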