From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1753023AbdLDRBV (ORCPT ); Mon, 4 Dec 2017 12:01:21 -0500
Received: from mail-oi0-f45.google.com ([209.85.218.45]:32961 "EHLO
        mail-oi0-f45.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
        with ESMTP id S1752971AbdLDRBO (ORCPT );
        Mon, 4 Dec 2017 12:01:14 -0500
X-Google-Smtp-Source: AGs4zMYk7w+i4NnNdXZ/hgHGlG0EMvKaiY3DrHLxJavhq/ZGFf1wzmp1W9cWMbPdZusb011sTvfx+qVTB1d2kK0wEyQ=
MIME-Version: 1.0
In-Reply-To: <20171204093156.mp36zkcwrxkenixb@dhcp22.suse.cz>
References: <20171130095323.ovrq2nenb6ztiapy@dhcp22.suse.cz>
        <20171130174201.stbpuye4gu5rxwkm@dhcp22.suse.cz>
        <20171130181741.2y5nyflyhqxg6y5p@dhcp22.suse.cz>
        <20171130190117.GF7754@ziepe.ca>
        <20171201101218.mxjyv4fc4cjwhf2o@dhcp22.suse.cz>
        <20171201160204.GI7754@ziepe.ca>
        <20171204093156.mp36zkcwrxkenixb@dhcp22.suse.cz>
From: Dan Williams
Date: Mon, 4 Dec 2017 09:01:12 -0800
Message-ID:
Subject: Re: [PATCH v3 1/4] mm: introduce get_user_pages_longterm
To: Michal Hocko
Cc: Jason Gunthorpe , Andrew Morton , Linux MM ,
        "linux-kernel@vger.kernel.org" , Christoph Hellwig ,
        "stable@vger.kernel.org" , "linux-nvdimm@lists.01.org" ,
        linux-rdma
Content-Type: text/plain; charset="UTF-8"
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Dec 4, 2017 at 1:31 AM, Michal Hocko wrote:
>
> On Fri 01-12-17 08:29:53, Dan Williams wrote:
> > On Fri, Dec 1, 2017 at 8:02 AM, Jason Gunthorpe wrote:
> > >
> > > On Fri, Dec 01, 2017 at 11:12:18AM +0100, Michal Hocko wrote:
> > > > On Thu 30-11-17 12:01:17, Jason Gunthorpe wrote:
> > > > > On Thu, Nov 30, 2017 at 10:32:42AM -0800, Dan Williams wrote:
> > > > > > > Who and how many LRU pages can pin that way and how do you prevent nasty
> > > > > > > users to DoS systems this way?
> > > > > >
> > > > > > I assume this is something the RDMA community has had to contend with?
> > > > > > I'm not an RDMA person, I'm just here to fix dax.
> > > > >
> > > > > The RDMA implementation respects the mlock rlimit
> > > >
> > > > OK, so then I am kind of lost in why do we need a special g-u-p variant.
> > > > The documentation doesn't say and quite contrary it assumes that the
> > > > caller knows what he is doing. This cannot be the right approach.
> > >
> > > I thought it was because get_user_pages_longterm is supposed to fail
> > > on DAX mappings?
> >
> > Correct, the rlimit checks are a separate issue,
> > get_user_pages_longterm is only there to avoid open coding vma lookup
> > and vma_is_fsdax() checks in multiple code paths.
>
> Then it is a terrible misnomer. One would expect this is a proper way to
> get a longterm pin on a page.

Yes, I can see that. The "get_user_pages_longterm" symbol name is
encoding the lifetime expectations of the caller vs properly
implementing 'longterm' pinning. However the proper interface to
establish a long term pin does not currently exist and ultimately
needs more coordination with userspace. We need a way for the kernel
to explicitly revoke the pin. So, this get_user_pages_longterm change
is only a stop-gap to prevent data corruption and userspace from
growing further expectations that filesystem-dax supports long term
pinning through the legacy interfaces.

> > > And maybe we should think about moving the rlimit accounting into this
> > > new function too someday?
> >
> > DAX pages are not accounted in any rlimit because they are statically
> > allocated reserved memory regions.
>
> Which is OK, but how do you prevent anybody calling this function on
> normal LRU pages?

I don't, and didn't consider this angle as it's a consideration that
is missing from the existing gup interfaces. It is an additional gap
we need to fill.
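
For readers following the thread, here is a rough userspace model of the
behaviour described above: pin the pages as usual, then fail the whole
request if any of them sits in a filesystem-dax mapping. This is only a
sketch, not the patch itself. Every mock_* name is hypothetical, the
is_fsdax flag stands in for what vma_is_fsdax() would report, and the
-EOPNOTSUPP return value is an assumption made for the illustration. It
models neither the rlimit accounting nor any restriction on normal LRU
pages, which are the gaps discussed in the thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel types; not the real definitions. */
struct mock_vma {
    bool is_fsdax;      /* models what vma_is_fsdax(vma) would report */
};

struct mock_page {
    int pin_count;      /* models the reference taken by a gup pin */
};

/* Models a plain gup call: pins every page and returns the count. */
static long mock_get_user_pages(struct mock_page **pages, long nr_pages)
{
    for (long i = 0; i < nr_pages; i++)
        pages[i]->pin_count++;
    return nr_pages;
}

/*
 * Models the "longterm" variant: pin as usual, then refuse the whole
 * request if any page is covered by a filesystem-dax mapping.
 * vmas[i] is the mapping that covers pages[i].
 * The error code is an assumption made for this sketch.
 */
static long mock_get_user_pages_longterm(struct mock_page **pages,
                                         struct mock_vma **vmas,
                                         long nr_pages)
{
    assert(pages != NULL && vmas != NULL);

    long rc = mock_get_user_pages(pages, nr_pages);
    long i;

    /* Stop at the first fsdax-backed mapping, if there is one. */
    for (i = 0; i < rc; i++)
        if (vmas[i]->is_fsdax)
            break;
    if (i >= rc)
        return rc;              /* no fsdax mapping: keep the pins */

    /* Drop every pin taken above before reporting the failure. */
    for (i = 0; i < rc; i++)
        pages[i]->pin_count--;
    return -EOPNOTSUPP;
}
```

A caller that wants a long-lived pin would go through the checking
variant and treat the error as "this mapping cannot be pinned this way",
which is the one place the vma lookup and fsdax test are kept instead of
being open coded in each code path.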