nvdimm.lists.linux.dev archive mirror
 help / color / mirror / Atom feed
From: Ira Weiny <ira.weiny@intel.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Jan Kara" <jack@suse.cz>,
	"Dan Williams" <dan.j.williams@intel.com>,
	"Theodore Ts'o" <tytso@mit.edu>,
	"Jeff Layton" <jlayton@kernel.org>,
	"Dave Chinner" <david@fromorbit.com>,
	"Matthew Wilcox" <willy@infradead.org>,
	linux-xfs@vger.kernel.org,
	"Andrew Morton" <akpm@linux-foundation.org>,
	"John Hubbard" <jhubbard@nvidia.com>,
	"Jérôme Glisse" <jglisse@redhat.com>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvdimm@lists.01.org, linux-ext4@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal
Date: Wed, 12 Jun 2019 15:13:36 -0700	[thread overview]
Message-ID: <20190612221336.GA27080@iweiny-DESK2.sc.intel.com> (raw)
In-Reply-To: <20190612191421.GM3876@ziepe.ca>

On Wed, Jun 12, 2019 at 04:14:21PM -0300, Jason Gunthorpe wrote:
> On Wed, Jun 12, 2019 at 02:09:07PM +0200, Jan Kara wrote:
> > On Wed 12-06-19 08:47:21, Jason Gunthorpe wrote:
> > > On Wed, Jun 12, 2019 at 12:29:17PM +0200, Jan Kara wrote:
> > > 
> > > > > > The main objection to the current ODP & DAX solution is that very
> > > > > > little HW can actually implement it, having the alternative still
> > > > > > require HW support doesn't seem like progress.
> > > > > > 
> > > > > > I think we will eventually start seein some HW be able to do this
> > > > > > invalidation, but it won't be universal, and I'd rather leave it
> > > > > > optional, for recovery from truely catastrophic errors (ie my DAX is
> > > > > > on fire, I need to unplug it).
> > > > > 
> > > > > Agreed.  I think software wise there is not much some of the devices can do
> > > > > with such an "invalidate".
> > > > 
> > > > So out of curiosity: What does RDMA driver do when userspace just closes
> > > > the file pointing to RDMA object? It has to handle that somehow by aborting
> > > > everything that's going on... And I wanted similar behavior here.
> > > 
> > > It aborts *everything* connected to that file descriptor. Destroying
> > > everything avoids creating inconsistencies that destroying a subset
> > > would create.
> > > 
> > > What has been talked about for lease break is not destroying anything
> > > but very selectively saying that one memory region linked to the GUP
> > > is no longer functional.
> > 
> > OK, so what I had in mind was that if RDMA app doesn't play by the rules
> > and closes the file with existing pins (and thus layout lease) we would
> > force it to abort everything. Yes, it is disruptive but then the app didn't
> > obey the rule that it has to maintain file lease while holding pins. Thus
> > such situation should never happen unless the app is malicious / buggy.
> 
> We do have the infrastructure to completely revoke the entire
> *content* of a FD (this is called device disassociate). It is
> basically close without the app doing close. But again it only works
> with some drivers. However, this is more likely something a driver
> could support without a HW change though.
> 
> It is quite destructive as it forcibly kills everything RDMA related
> the process(es) are doing, but it is less violent than SIGKILL, and
> there is perhaps a way for the app to recover from this, if it is
> coded for it.

I don't think many are...  I think most would effectively be "killed" if this
happened to them.

> 
> My preference would be to avoid this scenario, but if it is really
> necessary, we could probably build it with some work.
> 
> The only case we use it today is forced HW hot unplug, so it is rarely
> used and only for an 'emergency' like use case.

I'd really like to avoid this as well.  I think it will be very confusing for
RDMA apps to have their context suddenly be invalid.  I think if we have a way
for admins to ID who is pinning a file the admin can take more appropriate
action on those processes.   Up to and including killing the process.

Ira

  reply	other threads:[~2019-06-12 22:13 UTC|newest]

Thread overview: 76+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-06-06  1:45 [PATCH RFC 00/10] RDMA/FS DAX truncate proposal ira.weiny
2019-06-06  1:45 ` [PATCH RFC 01/10] fs/locks: Add trace_leases_conflict ira.weiny
2019-06-09 12:52   ` Jeff Layton
2019-06-06  1:45 ` [PATCH RFC 02/10] fs/locks: Export F_LAYOUT lease to user space ira.weiny
2019-06-09 13:00   ` Jeff Layton
2019-06-11 21:38     ` Ira Weiny
2019-06-12  9:46       ` Jan Kara
2019-06-06  1:45 ` [PATCH RFC 03/10] mm/gup: Pass flags down to __gup_device_huge* calls ira.weiny
2019-06-06  6:18   ` Christoph Hellwig
2019-06-06 16:10     ` Ira Weiny
2019-06-06  1:45 ` [PATCH RFC 04/10] mm/gup: Ensure F_LAYOUT lease is held prior to GUP'ing pages ira.weiny
2019-06-06  1:45 ` [PATCH RFC 05/10] fs/ext4: Teach ext4 to break layout leases ira.weiny
2019-06-06  1:45 ` [PATCH RFC 06/10] fs/ext4: Teach dax_layout_busy_page() to operate on a sub-range ira.weiny
2019-06-06  1:45 ` [PATCH RFC 07/10] fs/ext4: Fail truncate if pages are GUP pinned ira.weiny
2019-06-06 10:58   ` Jan Kara
2019-06-06 16:17     ` Ira Weiny
2019-06-06  1:45 ` [PATCH RFC 08/10] fs/xfs: Teach xfs to use new dax_layout_busy_page() ira.weiny
2019-06-06  1:45 ` [PATCH RFC 09/10] fs/xfs: Fail truncate if pages are GUP pinned ira.weiny
2019-06-06  1:45 ` [PATCH RFC 10/10] mm/gup: Remove FOLL_LONGTERM DAX exclusion ira.weiny
2019-06-06  5:52 ` [PATCH RFC 00/10] RDMA/FS DAX truncate proposal John Hubbard
2019-06-06 17:11   ` Ira Weiny
2019-06-06 19:46     ` Jason Gunthorpe
2019-06-06 10:42 ` Jan Kara
2019-06-06 15:35   ` Dan Williams
2019-06-06 19:51   ` Jason Gunthorpe
2019-06-06 22:22     ` Ira Weiny
2019-06-07 10:36       ` Jan Kara
2019-06-07 12:17         ` Jason Gunthorpe
2019-06-07 14:52           ` Ira Weiny
2019-06-07 15:10             ` Jason Gunthorpe
2019-06-12 10:29             ` Jan Kara
2019-06-12 11:47               ` Jason Gunthorpe
2019-06-12 12:09                 ` Jan Kara
2019-06-12 18:41                   ` Dan Williams
2019-06-13  7:17                     ` Jan Kara
2019-06-12 19:14                   ` Jason Gunthorpe
2019-06-12 22:13                     ` Ira Weiny [this message]
2019-06-12 22:54                       ` Dan Williams
2019-06-12 23:33                         ` Ira Weiny
2019-06-13  1:14                           ` Dan Williams
2019-06-13 15:13                             ` Jason Gunthorpe
2019-06-13 16:25                               ` Dan Williams
2019-06-13 17:18                                 ` Jason Gunthorpe
2019-06-13 16:53                           ` Dan Williams
2019-06-13 15:12                         ` Jason Gunthorpe
2019-06-13  7:53                       ` Jan Kara
2019-06-12 18:49               ` Dan Williams
2019-06-13  7:43                 ` Jan Kara
2019-06-06 22:03   ` Ira Weiny
2019-06-06 22:26     ` Ira Weiny
2019-06-06 22:28     ` Dave Chinner
2019-06-07 11:04     ` Jan Kara
2019-06-07 18:25       ` Ira Weiny
2019-06-07 18:50         ` Jason Gunthorpe
2019-06-08  0:10         ` Dave Chinner
2019-06-09  1:29           ` Ira Weiny
2019-06-12 12:37           ` Matthew Wilcox
2019-06-12 23:30             ` Ira Weiny
2019-06-13  0:55               ` Dave Chinner
2019-06-13 20:34                 ` Ira Weiny
2019-06-14  3:42                   ` Dave Chinner
2019-06-13  0:25             ` Dave Chinner
2019-06-13  3:23               ` Matthew Wilcox
2019-06-13  4:36                 ` Dave Chinner
2019-06-13 10:47                   ` Matthew Wilcox
2019-06-13 15:29                 ` Jason Gunthorpe
2019-06-13 15:27               ` Matthew Wilcox
2019-06-13 21:13                 ` Ira Weiny
2019-06-13 23:45                   ` Jason Gunthorpe
2019-06-14  0:00                     ` Ira Weiny
2019-06-14  2:09                     ` Dave Chinner
2019-06-14  2:31                       ` Matthew Wilcox
2019-06-14  3:07                         ` Dave Chinner
2019-06-20 14:52                 ` Jan Kara
2019-06-13 20:34               ` Ira Weiny
2019-06-14  2:58                 ` Dave Chinner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20190612221336.GA27080@iweiny-DESK2.sc.intel.com \
    --to=ira.weiny@intel.com \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@fromorbit.com \
    --cc=jack@suse.cz \
    --cc=jgg@ziepe.ca \
    --cc=jglisse@redhat.com \
    --cc=jhubbard@nvidia.com \
    --cc=jlayton@kernel.org \
    --cc=linux-ext4@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-nvdimm@lists.01.org \
    --cc=linux-xfs@vger.kernel.org \
    --cc=tytso@mit.edu \
    --cc=willy@infradead.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).