From: Dan Williams <dan.j.williams@intel.com>
To: Al Viro <viro@zeniv.linux.org.uk>
Cc: "J. Bruce Fields" <bfields@fieldses.org>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	Jeff Layton <jlayton@poochiereds.net>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [RFC PATCH] vfs: shutdown lease notifications on file close
Date: Fri, 13 Oct 2017 10:43:34 -0700
Message-ID: <CAPcyv4jMPrjaoQj32EMHvzmGAwi2j28_6B_LseS9BE5eJqqkgQ@mail.gmail.com>
In-Reply-To: <20171013170105.GF21978@ZenIV.linux.org.uk>

On Fri, Oct 13, 2017 at 10:01 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Fri, Oct 13, 2017 at 08:56:10AM -0700, Dan Williams wrote:
>> While implementing MAP_DIRECT, an mmap flag that arranges for an
>> FL_LAYOUT lease to be established, Al noted:
>>
>>     You are not even guaranteed that descriptor will remain be still
>>     open by the time you pass it down to your helper, nevermind the
>>     moment when event actually happens...
>>
>> The first problem can be solved with an fd{get,put} at mmap
>> {entry,exit}.
>
> Huh?  fdget() does *NOT* guarantee that descriptor won't get closed.  What
> it does is guarantee that struct file won't get closed under you, which
> is nowhere near the same thing.  And while we are at it, it certainly
> _is_ called by mmap()...
>
>> The second problem appears to be a general issue.
>>
>> Leases follow the lifetime of the inode, so it is possible for a lease
>> to be broken after the file is closed. When that happens userspace may
>> get a notification on a stale fd. Of course it is not recommended that a
>> process close a file descriptor with an active lease, but if it does we
>> should assume that the notification is not needed either. Walk leases at
>> close time and invalidate any pending fasync instances.
>
> What the hell is special about close(2) and not, e.g. dup2(2)?  Or execve(2)
> triggering close-on-exec, etc...  Besides, you are changing a user-visible
> behaviour here.  Suppose your process forks and the child closes all
> descriptors; should that stop SIGIO delivery to the parent?
>
> Let's step back for a minute; could you describe how the userland is supposed
> to use that thing?
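
For reference, the descriptor vs. struct file distinction raised above
can be seen from userspace. This is only an analogy, not the in-kernel
fdget() path: the helper name descriptor_vs_open_file() and the /tmp
template are made up for illustration. It shows that closing one
descriptor does not close the underlying open file while another
reference to it exists.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if, after close(fd), the descriptor
 * number is invalid while a dup()ed descriptor still reaches the same
 * open file. Returns 0 if that is not observed, -1 on setup failure.
 */
int descriptor_vs_open_file(void)
{
	char path[] = "/tmp/fd-demo-XXXXXX";	/* assumed scratch location */
	int fd, dupfd, fd_is_gone, file_still_usable;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);

	/* Second descriptor referring to the same open file. */
	dupfd = dup(fd);
	if (dupfd < 0) {
		close(fd);
		return -1;
	}
	assert(dupfd != fd);

	/* Close the original descriptor: only the table slot goes away. */
	close(fd);
	fd_is_gone = (fcntl(fd, F_GETFD) == -1 && errno == EBADF);

	/* The open file itself is still alive through dupfd. */
	file_still_usable = (write(dupfd, "x", 1) == 1);

	close(dupfd);
	return fd_is_gone && file_still_usable;
}
```

The same holds across fork(): a child closing its copies of the
descriptors leaves the parent's references, and the file, intact.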

MAP_DIRECT is meant as a way to safely pass DAX mappings of a file
to the RDMA sub-system, or any sub-system that follows a memory
registration design pattern. RDMA expects that once it has done
get_user_pages() it has exclusive access to the memory backing
the file mapping indefinitely. With page cache backed file mappings we
can truncate and hole punch the file at will, and the RDMA operations
will continue to pages that are no longer part of the file. Yes, that
breaks coherency, but it otherwise does not cause damage to unrelated
file blocks. With DAX we do not have the luxury of an indirect page
for the RDMA to land on; the operations go straight to file blocks
in persistent memory.

With MAP_DIRECT the proposal is that when the RDMA memory registration
code sees 'vma_is_dax(vma) == true' it calls a new ->lease_direct()
vm_operation to take an FL_LAYOUT lease against the file to protect
against truncate / fallocate. Lease expiration triggers a callback to
redirect or shut down RDMA. The filesystem mmap implementation also
arranges for an FL_LAYOUT lease to be taken at mmap time, when the fd
is available, to set up a SIGIO notification.

If we don't take a lease at mmap time then we would need to develop a
notification mechanism that is specific to the RDMA code, and using
SIGIO on the mmap fd seemed a more generic solution to me.
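
To make the userspace side concrete, here is a rough sketch of the
notification flow using the lease interface that exists today,
fcntl(F_SETLEASE) plus F_SETSIG. It is a stand-in, not MAP_DIRECT:
it takes an ordinary F_WRLCK lease rather than an FL_LAYOUT lease, and
lease_notify_demo(), on_lease_break() and the /tmp template are names
invented for the example. The handler receives the leased descriptor
number in si_fd, which is the value that can go stale if the
descriptor is closed or reused before the lease is broken.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* F_SETLEASE, F_SETSIG */
#endif
#include <fcntl.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static volatile sig_atomic_t notified_fd = -1;

/* SA_SIGINFO handler: record which descriptor the break refers to. */
static void on_lease_break(int sig, siginfo_t *info, void *ctx)
{
	(void)sig;
	(void)ctx;
	notified_fd = info->si_fd;
}

/*
 * Hypothetical demo. Returns:
 *   2  lease taken and a break notification for our fd was seen
 *   1  lease taken, no notification observed
 *   0  leases could not be set up here (filesystem, permissions)
 *  -1  setup failure
 */
int lease_notify_demo(void)
{
	char path[] = "/tmp/lease-demo-XXXXXX";	/* assumed scratch location */
	struct sigaction sa;
	int fd, other, result = 0;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;

	memset(&sa, 0, sizeof(sa));
	sa.sa_sigaction = on_lease_break;
	sa.sa_flags = SA_SIGINFO;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGIO, &sa, NULL) < 0) {
		result = -1;
		goto out;
	}

	/* Ask for siginfo (si_fd) delivery, then take the lease. */
	if (fcntl(fd, F_SETSIG, SIGIO) < 0 ||
	    fcntl(fd, F_SETLEASE, F_WRLCK) < 0)
		goto out;
	result = 1;

	/* A conflicting open acts as the lease breaker; O_NONBLOCK
	 * keeps it from waiting for the lease to be released. */
	other = open(path, O_RDONLY | O_NONBLOCK);
	if (other >= 0)
		close(other);

	if (notified_fd == fd)
		result = 2;

	/* Release the lease, as a well-behaved holder would. */
	fcntl(fd, F_SETLEASE, F_UNLCK);
out:
	unlink(path);
	close(fd);
	return result;
}
```

In the MAP_DIRECT case the handler would be where the application
tears down or redirects its RDMA registration before the layout
changes; that part is not shown here.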
