linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: Ira Weiny <ira.weiny@intel.com>
To: John Hubbard <jhubbard@nvidia.com>
Cc: Dave Chinner <david@fromorbit.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Jan Kara <jack@suse.cz>,
	Andrew Morton <akpm@linux-foundation.org>,
	Dan Williams <dan.j.williams@intel.com>,
	Matthew Wilcox <willy@infradead.org>,
	Theodore Ts'o <tytso@mit.edu>, Michal Hocko <mhocko@suse.com>,
	linux-xfs@vger.kernel.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-)
Date: Thu, 29 Aug 2019 09:16:28 -0700	[thread overview]
Message-ID: <20190829161627.GB18249@iweiny-DESK2.sc.intel.com> (raw)
In-Reply-To: <3e5c5053-a74a-509c-660c-a6075ed87f11@nvidia.com>

On Wed, Aug 28, 2019 at 08:27:23PM -0700, John Hubbard wrote:
> On 8/28/19 7:02 PM, Ira Weiny wrote:
> > On Mon, Aug 26, 2019 at 03:55:10PM +1000, Dave Chinner wrote:
> > > On Fri, Aug 23, 2019 at 10:08:36PM -0700, Ira Weiny wrote:
> > > > On Sat, Aug 24, 2019 at 10:11:24AM +1000, Dave Chinner wrote:
> > > > > On Fri, Aug 23, 2019 at 09:04:29AM -0300, Jason Gunthorpe wrote:
> ...
> > > 
> > > Sure, that part works because the struct file is passed. It doesn't
> > > end up with the same fd number in the other process, though.
> > > 
> > > The issue is that layout leases need to notify userspace when they
> > > are broken by the kernel, so a lease stores the owner pid/tid in the
> > > file->f_owner field via __f_setown(). It also keeps a struct fasync
> > > attached to the file_lock that records the fd that the lease was
> > > created on.  When a signal needs to be sent to userspace for that
> > > lease, we call kill_fasync() and that walks the list of fasync
> > > structures on the lease and calls:
> > > 
> > > 	send_sigio(fown, fa->fa_fd, band);
> > > 
> > > And it does for every fasync struct attached to a lease. Yes, a
> > > lease can track multiple fds, but it can only track them in a single
> > > process context. The moment the struct file is shared with another
> > > process, the lease is no longer capable of sending notifications to
> > > all the lease holders.
> > > 
> > > Yes, you can change the owning process via F_SETOWNER, but that's
> > > still only a single process context, and you can't change the fd in
> > > the fasync list. You can add new fd to an existing lease by calling
> > > F_SETLEASE on the new fd, but you still only have a single process
> > > owner context for signal delivery.
> > > 
> > > As such, leases that require callbacks to userspace are currently
> > > only valid within the process context the lease was taken in.
> > 
> > But for long term pins we are not requiring callbacks.
> > 
> 
> Hi Ira,
> 
> If "require callbacks to userspace" means sending SIGIO, then actually
> FOLL_LONGTERM *does* require those callbacks. Because we've been, so
> far, equating FOLL_LONGTERM with the vaddr_pin struct and with a lease.
> 
> What am I missing here?

We agreed back in June that the layout lease would have 2 "levels".  The
"normal" layout lease would cause SIGIO and could be broken and another
"exclusive" level which could _not_ be broken.

Because we _can't_ _trust_ user space to react to the SIGIO properly the
"exclusive" lease is required to take the longterm pins.  Also this is the
lease which causes the truncate to fail (return ETXTBSY) because the kernel
can't break the lease.

The vaddr_pin struct in the current RFC is there for a couple of reasons.

1) To ensure that we have a way to correlate the long term pin user with the
   file if the data file FD's are closed.  (ie the application has zombie'd the
   lease).

2) And more importantly as a token the vaddr_pin*() callers use to be able to
   properly ref count the file itself while in use.

Ira



  reply	other threads:[~2019-08-29 16:16 UTC|newest]

Thread overview: 110+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-08-09 22:58 [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-) ira.weiny
2019-08-09 22:58 ` [RFC PATCH v2 01/19] fs/locks: Export F_LAYOUT lease to user space ira.weiny
2019-08-09 23:52   ` Dave Chinner
2019-08-12 17:36     ` Ira Weiny
2019-08-14  8:05       ` Dave Chinner
2019-08-14 11:21         ` Jeff Layton
2019-08-14 11:38           ` Dave Chinner
2019-08-09 22:58 ` [RFC PATCH v2 02/19] fs/locks: Add Exclusive flag to user Layout lease ira.weiny
2019-08-14 14:15   ` Jeff Layton
2019-08-14 21:56     ` Dave Chinner
2019-08-26 10:41       ` Jeff Layton
2019-08-29 23:34         ` Ira Weiny
2019-09-04 12:52           ` Jeff Layton
2019-09-04 23:12   ` John Hubbard
2019-08-09 22:58 ` [RFC PATCH v2 03/19] mm/gup: Pass flags down to __gup_device_huge* calls ira.weiny
2019-08-09 22:58 ` [RFC PATCH v2 04/19] mm/gup: Ensure F_LAYOUT lease is held prior to GUP'ing pages ira.weiny
2019-08-09 22:58 ` [RFC PATCH v2 05/19] fs/ext4: Teach ext4 to break layout leases ira.weiny
2019-08-09 22:58 ` [RFC PATCH v2 06/19] fs/ext4: Teach dax_layout_busy_page() to operate on a sub-range ira.weiny
2019-08-23 15:18   ` Vivek Goyal
2019-08-29 18:52     ` Ira Weiny
2019-08-09 22:58 ` [RFC PATCH v2 07/19] fs/xfs: Teach xfs to use new dax_layout_busy_page() ira.weiny
2019-08-09 23:30   ` Dave Chinner
2019-08-12 18:05     ` Ira Weiny
2019-08-14  8:04       ` Dave Chinner
2019-08-09 22:58 ` [RFC PATCH v2 08/19] fs/xfs: Fail truncate if page lease can't be broken ira.weiny
2019-08-09 23:22   ` Dave Chinner
2019-08-12 18:08     ` Ira Weiny
2019-08-09 22:58 ` [RFC PATCH v2 09/19] mm/gup: Introduce vaddr_pin structure ira.weiny
2019-08-10  0:06   ` John Hubbard
2019-08-09 22:58 ` [RFC PATCH v2 10/19] mm/gup: Pass a NULL vaddr_pin through GUP fast ira.weiny
2019-08-10  0:06   ` John Hubbard
2019-08-09 22:58 ` [RFC PATCH v2 11/19] mm/gup: Pass follow_page_context further down the call stack ira.weiny
2019-08-10  0:18   ` John Hubbard
2019-08-12 19:01     ` Ira Weiny
2019-08-09 22:58 ` [RFC PATCH v2 12/19] mm/gup: Prep put_user_pages() to take an vaddr_pin struct ira.weiny
2019-08-10  0:30   ` John Hubbard
2019-08-12 20:46     ` Ira Weiny
2019-08-09 22:58 ` [RFC PATCH v2 13/19] {mm,file}: Add file_pins objects ira.weiny
2019-08-09 22:58 ` [RFC PATCH v2 14/19] fs/locks: Associate file pins while performing GUP ira.weiny
2019-08-09 22:58 ` [RFC PATCH v2 15/19] mm/gup: Introduce vaddr_pin_pages() ira.weiny
2019-08-10  0:09   ` John Hubbard
2019-08-12 21:00     ` Ira Weiny
2019-08-12 21:20       ` John Hubbard
2019-08-11 23:07   ` John Hubbard
2019-08-12 21:01     ` Ira Weiny
2019-08-12 12:28   ` Jason Gunthorpe
2019-08-12 21:48     ` Ira Weiny
2019-08-13 11:47       ` Jason Gunthorpe
2019-08-13 17:46         ` Ira Weiny
2019-08-13 17:56           ` John Hubbard
2019-08-09 22:58 ` [RFC PATCH v2 16/19] RDMA/uverbs: Add back pointer to system file object ira.weiny
2019-08-12 13:00   ` Jason Gunthorpe
2019-08-12 17:28     ` Ira Weiny
2019-08-12 17:56       ` Jason Gunthorpe
2019-08-12 21:15         ` Ira Weiny
2019-08-13 11:48           ` Jason Gunthorpe
2019-08-13 17:41             ` Ira Weiny
2019-08-13 18:00               ` Jason Gunthorpe
2019-08-13 20:38                 ` Ira Weiny
2019-08-14 12:23                   ` Jason Gunthorpe
2019-08-14 17:50                     ` Ira Weiny
2019-08-14 18:15                       ` Jason Gunthorpe
2019-09-04 22:25                     ` Ira Weiny
2019-09-11  8:19                       ` Jason Gunthorpe
2019-08-09 22:58 ` [RFC PATCH v2 17/19] RDMA/umem: Convert to vaddr_[pin|unpin]* operations ira.weiny
2019-08-09 22:58 ` [RFC PATCH v2 18/19] {mm,procfs}: Add display file_pins proc ira.weiny
2019-08-09 22:58 ` [RFC PATCH v2 19/19] mm/gup: Remove FOLL_LONGTERM DAX exclusion ira.weiny
2019-08-14 10:17 ` [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-) Jan Kara
2019-08-14 18:08   ` Ira Weiny
2019-08-15 13:05     ` Jan Kara
2019-08-16 19:05       ` Ira Weiny
2019-08-16 23:20         ` [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ; -) Ira Weiny
2019-08-19  6:36           ` Jan Kara
2019-08-17  2:26         ` [RFC PATCH v2 00/19] RDMA/FS DAX truncate proposal V1,000,002 ;-) Dave Chinner
2019-08-19  6:34           ` Jan Kara
2019-08-19  9:24             ` Dave Chinner
2019-08-19 12:38               ` Jason Gunthorpe
2019-08-19 21:53                 ` Ira Weiny
2019-08-20  1:12                 ` Dave Chinner
2019-08-20 11:55                   ` Jason Gunthorpe
2019-08-21 18:02                     ` Ira Weiny
2019-08-21 18:13                       ` Jason Gunthorpe
2019-08-21 18:22                         ` John Hubbard
2019-08-21 18:57                         ` Ira Weiny
2019-08-21 19:06                           ` Ira Weiny
2019-08-21 19:48                           ` Jason Gunthorpe
2019-08-21 20:44                             ` Ira Weiny
2019-08-21 23:49                               ` Jason Gunthorpe
2019-08-23  3:23                               ` Dave Chinner
2019-08-23 12:04                                 ` Jason Gunthorpe
2019-08-24  0:11                                   ` Dave Chinner
2019-08-24  5:08                                     ` Ira Weiny
2019-08-26  5:55                                       ` Dave Chinner
2019-08-29  2:02                                         ` Ira Weiny
2019-08-29  3:27                                           ` John Hubbard
2019-08-29 16:16                                             ` Ira Weiny [this message]
2019-09-02 22:26                                           ` Dave Chinner
2019-09-04 16:54                                             ` Ira Weiny
2019-08-25 19:39                                     ` Jason Gunthorpe
2019-08-24  4:49                                 ` Ira Weiny
2019-08-25 19:40                                   ` Jason Gunthorpe
2019-08-23  0:59                       ` Dave Chinner
2019-08-23 17:15                         ` Ira Weiny
2019-08-24  0:18                           ` Dave Chinner
2019-08-20  0:05               ` John Hubbard
2019-08-20  1:20                 ` Dave Chinner
2019-08-20  3:09                   ` John Hubbard
2019-08-20  3:36                     ` Dave Chinner
2019-08-21 18:43                       ` John Hubbard
2019-08-21 19:09                         ` Ira Weiny

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190829161627.GB18249@iweiny-DESK2.sc.intel.com \
    --to=ira.weiny@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=jgg@ziepe.ca \
    --cc=jhubbard@nvidia.com \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-rdma@vger.kernel.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=mhocko@suse.com \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).