linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jens Axboe <axboe@kernel.dk>
To: Jeff Layton <jlayton@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Jann Horn <jannh@google.com>
Cc: Miklos Szeredi <miklos@szeredi.hu>, Karel Zak <kzak@redhat.com>,
	David Howells <dhowells@redhat.com>, Ian Kent <raven@themaw.net>,
	Christian Brauner <christian.brauner@ubuntu.com>,
	James Bottomley <James.Bottomley@hansenpartnership.com>,
	Steven Whitehouse <swhiteho@redhat.com>,
	Miklos Szeredi <mszeredi@redhat.com>,
	viro <viro@zeniv.linux.org.uk>,
	Christian Brauner <christian@brauner.io>,
	"Darrick J. Wong" <darrick.wong@oracle.com>,
	Linux API <linux-api@vger.kernel.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	lkml <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 00/17] VFS: Filesystem information and notifications [ver #17]
Date: Tue, 3 Mar 2020 12:07:38 -0700	[thread overview]
Message-ID: <5394c5c4-aeb8-97d5-8347-e763a1abd9ed@kernel.dk> (raw)
In-Reply-To: <dbb06c63c17c23fcacdd99e8b2266804ee39ffe5.camel@kernel.org>

On 3/3/20 12:02 PM, Jeff Layton wrote:
> On Tue, 2020-03-03 at 09:55 -0700, Jens Axboe wrote:
>> On 3/3/20 9:51 AM, Jeff Layton wrote:
>>> On Tue, 2020-03-03 at 08:44 -0700, Jens Axboe wrote:
>>>> On 3/3/20 7:24 AM, Greg Kroah-Hartman wrote:
>>>>> On Tue, Mar 03, 2020 at 03:13:26PM +0100, Jann Horn wrote:
>>>>>> On Tue, Mar 3, 2020 at 3:10 PM Greg Kroah-Hartman
>>>>>> <gregkh@linuxfoundation.org> wrote:
>>>>>>> On Tue, Mar 03, 2020 at 02:43:16PM +0100, Greg Kroah-Hartman wrote:
>>>>>>>> On Tue, Mar 03, 2020 at 02:34:42PM +0100, Miklos Szeredi wrote:
>>>>>>>>> On Tue, Mar 3, 2020 at 2:14 PM Greg Kroah-Hartman
>>>>>>>>> <gregkh@linuxfoundation.org> wrote:
>>>>>>>>>
>>>>>>>>>>> Unlimited beers for a 21-line kernel patch?  Sign me up!
>>>>>>>>>>>
>>>>>>>>>>> Totally untested, barely compiled patch below.
>>>>>>>>>>
>>>>>>>>>> Ok, that didn't even build, let me try this for real now...
>>>>>>>>>
>>>>>>>>> Some comments on the interface:
>>>>>>>>
>>>>>>>> Ok, hey, let's do this proper :)
>>>>>>>
>>>>>>> Alright, how about this patch.
>>>>>>>
>>>>>>> Actually tested with some simple sysfs files.
>>>>>>>
>>>>>>> If people don't strongly object, I'll add "real" tests to it, hook it up
>>>>>>> to all arches, write a manpage, and all the fun fluff a new syscall
>>>>>>> deserves and submit it "for real".
>>>>>>
>>>>>> Just FYI, io_uring is moving towards the same kind of thing... IIRC
>>>>>> you can already use it to batch a bunch of open() calls, then batch a
>>>>>> bunch of read() calls on all the new fds and close them at the same
>>>>>> time. And I think they're planning to add support for doing
>>>>>> open()+read()+close() all in one go, too, except that it's a bit
>>>>>> complicated because passing forward the file descriptor in a generic
>>>>>> way is a bit complicated.
>>>>>
>>>>> It is complicated, I wouldn't recommend using io_ring for reading a
>>>>> bunch of procfs or sysfs files, that feels like a ton of overkill with
>>>>> too much setup/teardown to make it worth while.
>>>>>
>>>>> But maybe not, will have to watch and see how it goes.
>>>>
>>>> It really isn't, and I too thinks it makes more sense than having a
>>>> system call just for the explicit purpose of open/read/close. As Jann
>>>> said, you can't currently do a linked sequence of open/read/close,
>>>> because the fd passing between them isn't done. But that will come in
>>>> the future. If the use case is "a bunch of files", then you could
>>>> trivially do "open bunch", "read bunch", "close bunch" in three separate
>>>> steps.
>>>>
>>>> Curious what the use case is for this that warrants a special system
>>>> call?
>>>>
>>>
>>> Agreed. I'd really rather see something more general-purpose than the
>>> proposed readfile(). At least with NFS and SMB, you can compound
>>> together fairly arbitrary sorts of operations, and it'd be nice to be
>>> able to pattern calls into the kernel for those sorts of uses.
>>>
>>> So, NFSv4 has the concept of a current_stateid that is maintained by the
>>> server. So basically you can do all this (e.g.) in a single compound:
>>>
>>> open <some filehandle get a stateid>
>>> write <using that stateid>
>>> close <same stateid>
>>>
>>> It'd be nice to be able to do something similar with io_uring. Make it
>>> so that when you do an open, you set the "current fd" inside the
>>> kernel's context, and then be able to issue io_uring requests that
>>> specify a magic "fd" value that use it.
>>>
>>> That would be a really useful pattern.
>>
>> For io_uring, you can link requests that you submit into a chain. Each
>> link in the chain is done in sequence. Which means that you could do:
>>
>> <open some file><read from that file><close that file>
>>
>> in a single sequence. The only thing that is missing right now is a way
>> to have the return of that open propagated to the 'fd' of the read and
>> close, and it's actually one of the topics to discuss at LSFMM next
>> month.
>>
>> One approach would be to use BPF to handle this passing, another
>> suggestion has been to have the read/close specify some magic 'fd' value
>> that just means "inherit fd from result of previous". The latter sounds
>> very close to the stateid you mention above, and the upside here is that
>> it wouldn't explode the necessary toolchain to need to include BPF.
>>
>> In other words, this is really close to being reality and practically
>> feasible.
>>
> 
> Excellent.
> 
> Yes, the latter is exactly what I had in mind for this. I suspect that
> that would cover a large fraction of the potential use-cases for this.
> 
> Basically, all you'd need to do is keep a pointer to struct file in the
> internal state for the chain. Then, allow userland to specify some magic
> fd value for subsequent chained operations that says to use that instead
> of consulting the fdtable. Maybe use -4096 (-MAX_ERRNO - 1)?

Yeah I think that'd be a suitable way to signal that.

> That would cover the smb or nfs server sort of use cases, I think. For
> the sysfs cases, I guess you'd need to dispatch several chains, but that
> doesn't sound _too_ onerous.

The magic fd would be per-chain, so doing multiple chains wouldn't
really matter at all.

Let me try and hack this up, should be pretty trivial.

> In fact, with that you should even be able to emulate the proposed
> readlink syscall in a userland library.

Exactly

-- 
Jens Axboe


  reply	other threads:[~2020-03-03 19:07 UTC|newest]

Thread overview: 117+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-21 18:01 [PATCH 00/17] VFS: Filesystem information and notifications [ver #17] David Howells
2020-02-21 18:01 ` [PATCH 01/17] watch_queue: Add security hooks to rule on setting mount and sb watches " David Howells
2020-02-21 18:02 ` [PATCH 02/17] watch_queue: Implement mount topology and attribute change notifications " David Howells
2020-02-21 18:02 ` [PATCH 03/17] watch_queue: sample: Display mount tree " David Howells
2020-02-21 18:02 ` [PATCH 04/17] watch_queue: Introduce a non-repeating system-unique superblock ID " David Howells
2020-02-21 18:02 ` [PATCH 05/17] watch_queue: Add superblock notifications " David Howells
2020-02-21 18:02 ` [PATCH 06/17] watch_queue: sample: Display " David Howells
2020-02-21 18:02 ` [PATCH 07/17] fsinfo: Add fsinfo() syscall to query filesystem information " David Howells
2020-02-26  2:29   ` Aleksa Sarai
2020-02-28 14:44   ` David Howells
2020-02-21 18:02 ` [PATCH 08/17] fsinfo: Provide a bitmap of supported features " David Howells
2020-02-21 18:03 ` [PATCH 09/17] fsinfo: Allow fsinfo() to look up a mount object by ID " David Howells
2020-02-21 18:03 ` [PATCH 10/17] fsinfo: Allow mount information to be queried " David Howells
2020-03-04 14:58   ` Miklos Szeredi
2020-03-04 16:10   ` Miklos Szeredi
2020-02-21 18:03 ` [PATCH 11/17] fsinfo: sample: Mount listing program " David Howells
2020-02-21 18:03 ` [PATCH 12/17] fsinfo: Allow the mount topology propogation flags to be retrieved " David Howells
2020-02-21 18:03 ` [PATCH 13/17] fsinfo: Query superblock unique ID and notification counter " David Howells
2020-02-21 18:03 ` [PATCH 14/17] fsinfo: Add API documentation " David Howells
2020-02-21 18:03 ` [PATCH 15/17] fsinfo: Add support for AFS " David Howells
2020-02-21 18:03 ` [PATCH 16/17] fsinfo: Add example support for Ext4 " David Howells
2020-02-21 18:04 ` [PATCH 17/17] fsinfo: Add example support for NFS " David Howells
2020-02-21 20:21 ` [PATCH 00/17] VFS: Filesystem information and notifications " James Bottomley
2020-02-24 10:24   ` Miklos Szeredi
2020-02-24 14:55     ` James Bottomley
2020-02-24 15:28       ` Miklos Szeredi
2020-02-25 12:13         ` Steven Whitehouse
2020-02-25 15:28           ` James Bottomley
2020-02-25 15:47             ` Steven Whitehouse
2020-02-26  9:11             ` Miklos Szeredi
2020-02-26 10:51               ` Steven Whitehouse
2020-02-27  5:06               ` Ian Kent
2020-02-27  9:36                 ` Miklos Szeredi
2020-02-27 11:34                   ` Ian Kent
2020-02-27 13:45                     ` Miklos Szeredi
2020-02-27 15:14                       ` Karel Zak
2020-02-28  0:43                         ` Ian Kent
2020-02-28  8:35                           ` Miklos Szeredi
2020-02-28 12:27                             ` Greg Kroah-Hartman
2020-02-28 16:24                               ` Miklos Szeredi
2020-02-28 17:15                                 ` Al Viro
2020-03-02  8:43                                   ` Miklos Szeredi
2020-03-02 10:34                                 ` Karel Zak
2020-02-28 16:42                               ` David Howells
2020-02-28 15:08                             ` James Bottomley
2020-02-28 15:40                               ` Miklos Szeredi
2020-02-28  0:12                       ` Ian Kent
2020-02-28 15:52             ` Christian Brauner
2020-02-28 16:36             ` David Howells
2020-03-02  9:09               ` Miklos Szeredi
2020-03-02  9:38                 ` Greg Kroah-Hartman
2020-03-03  5:27                 ` Ian Kent
2020-03-03  7:46                   ` Miklos Szeredi
2020-03-06 16:25                     ` Miklos Szeredi
2020-03-06 19:43                       ` Al Viro
2020-03-06 19:54                         ` Miklos Szeredi
2020-03-06 19:58                         ` Al Viro
2020-03-06 20:05                           ` Al Viro
2020-03-06 20:11                             ` Miklos Szeredi
2020-03-06 20:37                             ` Al Viro
2020-03-06 20:38                               ` Al Viro
2020-03-06 20:45                                 ` Al Viro
2020-03-06 20:49                                   ` Al Viro
2020-03-06 20:51                                     ` Miklos Szeredi
2020-03-06 21:28                                       ` Al Viro
2020-03-06 20:56                                     ` Al Viro
2020-03-06 20:51                                   ` Miklos Szeredi
2020-03-07  9:48                       ` Greg Kroah-Hartman
2020-03-07 20:48                         ` Miklos Szeredi
2020-03-03  9:12                   ` David Howells
2020-03-03  9:26                     ` Miklos Szeredi
2020-03-03  9:48                       ` Miklos Szeredi
2020-03-03 10:21                         ` Steven Whitehouse
2020-03-03 10:32                           ` Miklos Szeredi
2020-03-03 11:09                             ` Ian Kent
2020-03-03 10:00                       ` Christian Brauner
2020-03-03 10:13                         ` Miklos Szeredi
2020-03-03 10:25                           ` Christian Brauner
2020-03-03 11:33                             ` Miklos Szeredi
2020-03-03 11:56                               ` Christian Brauner
2020-03-03 11:38                       ` Karel Zak
2020-03-03 13:03                         ` Greg Kroah-Hartman
2020-03-03 13:14                           ` Greg Kroah-Hartman
2020-03-03 13:34                             ` Miklos Szeredi
2020-03-03 13:43                               ` Greg Kroah-Hartman
2020-03-03 14:10                                 ` Greg Kroah-Hartman
2020-03-03 14:13                                   ` Jann Horn
2020-03-03 14:24                                     ` Greg Kroah-Hartman
2020-03-03 15:44                                       ` Jens Axboe
2020-03-03 16:37                                         ` Greg Kroah-Hartman
2020-03-03 16:51                                         ` Jeff Layton
2020-03-03 16:55                                           ` Jens Axboe
2020-03-03 19:02                                             ` Jeff Layton
2020-03-03 19:07                                               ` Jens Axboe [this message]
2020-03-03 19:23                                               ` Jens Axboe
2020-03-03 19:43                                                 ` Jeff Layton
2020-03-03 20:33                                                   ` Jens Axboe
2020-03-03 21:03                                                     ` Jeff Layton
2020-03-03 21:20                                                       ` Jens Axboe
2020-03-03 14:10                                 ` Miklos Szeredi
2020-03-03 14:29                                   ` Greg Kroah-Hartman
2020-03-03 14:40                                     ` Jann Horn
2020-03-03 16:51                                       ` Greg Kroah-Hartman
2020-03-03 16:57                                         ` Jann Horn
2020-03-03 20:15                                         ` Greg Kroah-Hartman
2020-03-03 14:40                                   ` David Howells
2020-03-04  4:20                                   ` Ian Kent
2020-03-03 14:19                                 ` David Howells
2020-03-03 16:59                                   ` Greg Kroah-Hartman
2020-03-03 14:23                               ` Christian Brauner
2020-03-03 15:23                                 ` Greg Kroah-Hartman
2020-03-03 15:53                                 ` David Howells
2020-03-04  2:01                           ` Ian Kent
2020-03-04 15:22                             ` Karel Zak
2020-03-04 16:49                               ` Greg Kroah-Hartman
2020-03-04 17:55                                 ` Karel Zak
2020-03-03 14:09                         ` David Howells

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5394c5c4-aeb8-97d5-8347-e763a1abd9ed@kernel.dk \
    --to=axboe@kernel.dk \
    --cc=James.Bottomley@hansenpartnership.com \
    --cc=christian.brauner@ubuntu.com \
    --cc=christian@brauner.io \
    --cc=darrick.wong@oracle.com \
    --cc=dhowells@redhat.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=jannh@google.com \
    --cc=jlayton@kernel.org \
    --cc=kzak@redhat.com \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=miklos@szeredi.hu \
    --cc=mszeredi@redhat.com \
    --cc=raven@themaw.net \
    --cc=swhiteho@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).