From: Milosz Tanski <milosz@adfin.com>
To: Jeff Moyer <jmoyer@redhat.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	"linux-fsdevel@vger.kernel.org" <linux-fsdevel@vger.kernel.org>,
	linux-aio@kvack.org, Mel Gorman <mgorman@suse.de>,
	Volker Lendecke <Volker.Lendecke@sernet.de>,
	Tejun Heo <tj@kernel.org>,
	michael.kerrisk@gmail.com
Subject: Re: [RFC PATCH 0/7] Non-blockling buffered fs read (page cache only)
Date: Mon, 15 Sep 2014 18:27:07 -0400
Message-ID: <CANP1eJF7b3RFd4NdyCfsFuf9uyhrahVqJkDHs81nd2ZvVDwzSg@mail.gmail.com>
In-Reply-To: <x49d2awr1ez.fsf@segfault.boston.devel.redhat.com>

Jeff,

This patchset creates new read syscalls (readv2/preadv2) that take
an extra flags argument (kind of like recvmsg). What it doesn't do is
change the current behavior of O_NONBLOCK when the file is opened
with the O_NONBLOCK flag. It shouldn't break any existing
applications, since you have to opt in by using the new
syscalls.

I don't have a preference either way on whether we should create a new
flag or re-use the O_NONBLOCK flag. Instead, I'm hoping to get some
consensus here from senior kernel developers like yourself. Maybe
RWF_NONBLOCK (I'm stealing the naming from eventfd's EFD_NONBLOCK).

As a side note, I noticed that EFD_NONBLOCK, SFD_NONBLOCK, etc. all
alias the value of O_NONBLOCK, and there are a bunch of build-time checks
in the code like this:
BUILD_BUG_ON(EFD_NONBLOCK != O_NONBLOCK);

Thanks,
- Milosz

On Mon, Sep 15, 2014 at 5:58 PM, Jeff Moyer <jmoyer@redhat.com> wrote:
> Hi, Milosz,
>
> I CC'd Michael Kerrisk, in case he has any opinions on the matter.
>
> Milosz Tanski <milosz@adfin.com> writes:
>
>> This patchset introduces the ability to perform a non-blocking read from
>> regular files in buffered IO mode. This works only for those filesystems
>> that have data in the page cache.
>>
>> It does this by introducing new syscalls readv2/writev2 and
>> preadv2/pwritev2. These new syscalls behave like the network sendmsg, recvmsg
>> syscalls that accept an extra flag argument (O_NONBLOCK).
>
> I thought you were going to introduce a new flag instead of using
> O_NONBLOCK for this.  I dug up an old email that suggested that enabling
> O_NONBLOCK for regular files (well, a device node in this case) broke a
> cd ripping or burning application.  I also found this old bugzilla,
> which states that squid would fail to start, and that gqview was also
> broken:
>   https://bugzilla.redhat.com/show_bug.cgi?id=136057
>
> More generally, do you expect the open(2) of a regular file with
> O_NONBLOCK to perform the same way as a pipe, fifo, or device (namely,
> that the open itself won't block)?  Should O_NONBLOCK affect writes to
> regular files?  What do you think the return value from poll and friends
> should be when a file is opened in this manner (probably not important,
> as poll always returns data ready on regular files)?  Also consider
> whether you want the O_NONBLOCK behaviour for mandatory file locks in
> your use case (or any other, for that matter).  If you issue a read and
> it returns -EAGAIN, should it be up to the application to kick off I/O
> to ensure it makes progress?
>
> I don't think O_NONBLOCK is the right flag.  What you're really
> specifying is a flag that prevents I/O in the read path, and nowhere
> else.  As such, I'd feel much better about this if we defined a new flag
> (O_NONBLOCK_READ maybe?  No, that's too verbose.).
>
> In summary, I like the idea, but I worry about overloading O_NONBLOCK.
>
> Cheers,
> Jeff



-- 
Milosz Tanski
CTO
16 East 34th Street, 15th floor
New York, NY 10016

p: 646-253-9055
e: milosz@adfin.com
