From: Ira Weiny <ira.weiny@intel.com>
To: Jason Gunthorpe <jgg@ziepe.ca>
Cc: "Theodore Ts'o" <tytso@mit.edu>, linux-nvdimm@lists.01.org, "Dave Chinner" <david@fromorbit.com>, "Jeff Layton" <jlayton@kernel.org>, linux-kernel@vger.kernel.org, "Matthew Wilcox" <willy@infradead.org>, linux-xfs@vger.kernel.org, linux-mm@kvack.org, "Jérôme Glisse" <jglisse@redhat.com>, "John Hubbard" <jhubbard@nvidia.com>, linux-fsdevel@vger.kernel.org, "Jan Kara" <jack@suse.cz>, linux-ext4@vger.kernel.org, "Andrew Morton" <akpm@linux-foundation.org>
Subject: Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal
Date: Fri, 7 Jun 2019 07:52:13 -0700
Message-ID: <20190607145213.GB14559@iweiny-DESK2.sc.intel.com>
In-Reply-To: <20190607121729.GA14802@ziepe.ca>

On Fri, Jun 07, 2019 at 09:17:29AM -0300, Jason Gunthorpe wrote:
> On Fri, Jun 07, 2019 at 12:36:36PM +0200, Jan Kara wrote:
>
> > Because the pins would be invisible to sysadmin from that point on.
>
> It is not invisible, it just shows up in a rdma specific kernel
> interface. You have to use rdma netlink to see the kernel object
> holding this pin.
>
> If this visibility is the main sticking point I suggest just enhancing
> the existing MR reporting to include the file info for current GUP
> pins and teaching lsof to collect information from there as well so it
> is easy to use.
>
> If the ownership of the lease transfers to the MR, and we report that
> ownership to userspace in a way lsof can find, then I think all the
> concerns that have been raised are met, right?

I was contemplating some new lsof feature yesterday. But what I don't think we
want is sysadmins to have multiple tools for multiple subsystems. Or even have
to teach lsof something new for every potential new subsystem user of GUP pins.

I was thinking more along the lines of reporting files which have GUP pins on
them directly somewhere (dare I say procfs?) and teaching lsof to report that
information. That would cover any subsystem which does a longterm pin.

>
> > ugly to live so we have to come up with something better. The best I can
> > currently come up with is to have a method associated with the lease that
> > would invalidate the RDMA context that holds the pins in the same way that
> > a file close would do it.
>
> This is back to requiring all RDMA HW to have some new behavior they
> currently don't have..
>
> The main objection to the current ODP & DAX solution is that very
> little HW can actually implement it, having the alternative still
> require HW support doesn't seem like progress.
>
> I think we will eventually start seein some HW be able to do this
> invalidation, but it won't be universal, and I'd rather leave it
> optional, for recovery from truely catastrophic errors (ie my DAX is
> on fire, I need to unplug it).

Agreed. I think software wise there is not much some of the devices can do
with such an "invalidate".

Ira