From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: from mail-oi1-x243.google.com (mail-oi1-x243.google.com [IPv6:2607:f8b0:4864:20::243])
    (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
    (No client certificate requested)
    by ml01.01.org (Postfix) with ESMTPS id CAE4A2129604D
    for ; Wed, 12 Jun 2019 11:50:03 -0700 (PDT)
Received: by mail-oi1-x243.google.com with SMTP id t76so12480272oih.4
    for ; Wed, 12 Jun 2019 11:50:03 -0700 (PDT)
MIME-Version: 1.0
References: <20190606014544.8339-1-ira.weiny@intel.com>
    <20190606104203.GF7433@quack2.suse.cz>
    <20190606195114.GA30714@ziepe.ca>
    <20190606222228.GB11698@iweiny-DESK2.sc.intel.com>
    <20190607103636.GA12765@quack2.suse.cz>
    <20190607121729.GA14802@ziepe.ca>
    <20190607145213.GB14559@iweiny-DESK2.sc.intel.com>
    <20190612102917.GB14578@quack2.suse.cz>
In-Reply-To: <20190612102917.GB14578@quack2.suse.cz>
From: Dan Williams
Date: Wed, 12 Jun 2019 11:49:52 -0700
Message-ID:
Subject: Re: [PATCH RFC 00/10] RDMA/FS DAX truncate proposal
List-Unsubscribe: ,
List-Archive:
List-Post:
List-Help:
List-Subscribe: ,
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: 7bit
Errors-To: linux-nvdimm-bounces@lists.01.org
Sender: "Linux-nvdimm"
To: Jan Kara
Cc: Theodore Ts'o , linux-nvdimm , Dave Chinner , Jeff Layton ,
    Linux Kernel Mailing List , Matthew Wilcox , linux-xfs ,
    Jason Gunthorpe , =?UTF-8?B?SsOpcsO0bWUgR2xpc3Nl?= , John Hubbard ,
    linux-fsdevel , Andrew Morton , linux-ext4 , Linux MM
List-ID:

On Wed, Jun 12, 2019 at 3:29 AM Jan Kara wrote:
>
> On Fri 07-06-19 07:52:13, Ira Weiny wrote:
> > On Fri, Jun 07, 2019 at 09:17:29AM -0300, Jason Gunthorpe wrote:
> > > On Fri, Jun 07, 2019 at 12:36:36PM +0200, Jan Kara wrote:
> > >
> > > > Because the pins would be invisible to sysadmin from that point on.
> > >
> > > It is not invisible, it just shows up in a rdma specific kernel
> > > interface. You have to use rdma netlink to see the kernel object
> > > holding this pin.
> > >
> > > If this visibility is the main sticking point I suggest just enhancing
> > > the existing MR reporting to include the file info for current GUP
> > > pins and teaching lsof to collect information from there as well so it
> > > is easy to use.
> > >
> > > If the ownership of the lease transfers to the MR, and we report that
> > > ownership to userspace in a way lsof can find, then I think all the
> > > concerns that have been raised are met, right?
> >
> > I was contemplating some new lsof feature yesterday. But what I don't
> > think we want is sysadmins to have multiple tools for multiple
> > subsystems. Or even have to teach lsof something new for every potential
> > new subsystem user of GUP pins.
>
> Agreed.
>
> > I was thinking more along the lines of reporting files which have GUP
> > pins on them directly somewhere (dare I say procfs?) and teaching lsof to
> > report that information. That would cover any subsystem which does a
> > longterm pin.
>
> So lsof already parses /proc//maps to learn about files held open by
> memory mappings. It could parse some other file as well I guess. The good
> thing about that would be that then "longterm pin" structure would just hold
> struct file reference. That would avoid any needs of special behavior on
> file close (the file reference in the "longterm pin" structure would make
> sure struct file and thus the lease stays around, we'd just need to make
> explicit lease unlock block until the "longterm pin" structure is freed).
> The bad thing is that it requires us to come up with a sane new proc
> interface for reporting "longterm pins" and associated struct file. Also we
> need to define what this interface shows if the pinned pages are in DRAM
> (either page cache or anon) and not on NVDIMM.
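
For reference, a rough sketch of the kind of parsing described above:
pulling the file-backed mappings out of text in the per-process maps
format that procfs exports. This is an illustration only, not how lsof
is actually implemented. The function name and the sample paths are
made up for the example, and it says nothing about what a new
"longterm pin" file would look like.

```python
from collections import defaultdict


def file_backed_mappings(maps_text):
    """Return {pathname: [(start, end), ...]} for file-backed mappings.

    Hypothetical helper. Each input line is expected to have the form
    "start-end perms offset dev inode [pathname]".
    """
    files = defaultdict(list)
    for line in maps_text.splitlines():
        parts = line.split(None, 5)
        if len(parts) < 6:
            continue  # no pathname field: anonymous mapping
        addr, _perms, _offset, _dev, inode, path = parts
        path = path.strip()
        if inode == "0" or not path.startswith("/"):
            continue  # pseudo-mappings such as [heap], [stack], [vdso]
        start, end = (int(x, 16) for x in addr.split("-"))
        files[path].append((start, end))
    return dict(files)


# Made-up sample input; the paths and addresses are illustrative only.
SAMPLE = """\
00400000-0040b000 r-xp 00000000 08:01 1234 /usr/bin/cat
7f00a0000000-7f00a0200000 rw-s 00000000 00:05 5678 /mnt/pmem/data
7ffd1c000000-7ffd1c021000 rw-p 00000000 00:00 0 [stack]
7f00b0000000-7f00b0001000 rw-p 00000000 00:00 0
"""

if __name__ == "__main__":
    for path, ranges in file_backed_mappings(SAMPLE).items():
        print(path, [(hex(s), hex(e)) for s, e in ranges])
```

On a Linux system the same function could be fed the contents of
/proc/self/maps instead of the sample text. Any new pin-reporting file
would need its own format, which is the open question in the quoted
paragraph.
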

The anon vs shared detection case is important because a longterm pin
might be blocking a memory-hot-unplug operation if it is pinning
ZONE_MOVABLE memory, but I don't think we want DRAM vs NVDIMM to be an
explicit concern of the interface. For the anon / cached case I expect
it might be useful to put that communication under the memory-blocks
sysfs interface. I.e. a list of pids that are pinning that
memory-block from being hot-unplugged.
_______________________________________________
Linux-nvdimm mailing list
Linux-nvdimm@lists.01.org
https://lists.01.org/mailman/listinfo/linux-nvdimm