From: Dan Williams
Date: Wed, 13 Nov 2019 11:23:19 -0800
Subject: Re: [PATCH v4 04/23] mm: devmap: refactor 1-based refcounting for ZONE_DEVICE pages
In-Reply-To: <20191113042710.3997854-5-jhubbard@nvidia.com>
References: <20191113042710.3997854-1-jhubbard@nvidia.com> <20191113042710.3997854-5-jhubbard@nvidia.com>
To: John Hubbard
Cc: Andrew Morton, Al Viro, Alex Williamson, Benjamin Herrenschmidt,
    Björn Töpel, Christoph Hellwig, Daniel Vetter, Dave Chinner,
    David Airlie,
Miller" , Ira Weiny , Jan Kara , Jason Gunthorpe , Jens Axboe , Jonathan Corbet , =?UTF-8?B?SsOpcsO0bWUgR2xpc3Nl?= , Magnus Karlsson , Mauro Carvalho Chehab , Michael Ellerman , Michal Hocko , Mike Kravetz , Paul Mackerras , Shuah Khan , Vlastimil Babka , bpf@vger.kernel.org, Maling list - DRI developers , KVM list , linux-block@vger.kernel.org, Linux Doc Mailing List , linux-fsdevel , linux-kselftest@vger.kernel.org, "Linux-media@vger.kernel.org" , linux-rdma , linuxppc-dev , Netdev , Linux MM , LKML Content-Type: text/plain; charset="UTF-8" Content-Transfer-Encoding: quoted-printable Sender: bpf-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: bpf@vger.kernel.org On Tue, Nov 12, 2019 at 8:27 PM John Hubbard wrote: > > An upcoming patch changes and complicates the refcounting and > especially the "put page" aspects of it. In order to keep > everything clean, refactor the devmap page release routines: > > * Rename put_devmap_managed_page() to page_is_devmap_managed(), > and limit the functionality to "read only": return a bool, > with no side effects. > > * Add a new routine, put_devmap_managed_page(), to handle checking > what kind of page it is, and what kind of refcount handling it > requires. > > * Rename __put_devmap_managed_page() to free_devmap_managed_page(), > and limit the functionality to unconditionally freeing a devmap > page. > > This is originally based on a separate patch by Ira Weiny, which > applied to an early version of the put_user_page() experiments. > Since then, J=C3=A9r=C3=B4me Glisse suggested the refactoring described a= bove. > > Suggested-by: J=C3=A9r=C3=B4me Glisse > Signed-off-by: Ira Weiny > Signed-off-by: John Hubbard > --- > include/linux/mm.h | 27 ++++++++++++++++--- > mm/memremap.c | 67 ++++++++++++++++++++-------------------------- > 2 files changed, 53 insertions(+), 41 deletions(-) > > diff --git a/include/linux/mm.h b/include/linux/mm.h > index a2adf95b3f9c..96228376139c 100644 > --- a/include/linux/mm.h > +++ b/include/linux/mm.h > @@ -967,9 +967,10 @@ static inline bool is_zone_device_page(const struct = page *page) > #endif > > #ifdef CONFIG_DEV_PAGEMAP_OPS > -void __put_devmap_managed_page(struct page *page); > +void free_devmap_managed_page(struct page *page); > DECLARE_STATIC_KEY_FALSE(devmap_managed_key); > -static inline bool put_devmap_managed_page(struct page *page) > + > +static inline bool page_is_devmap_managed(struct page *page) > { > if (!static_branch_unlikely(&devmap_managed_key)) > return false; > @@ -978,7 +979,6 @@ static inline bool put_devmap_managed_page(struct pag= e *page) > switch (page->pgmap->type) { > case MEMORY_DEVICE_PRIVATE: > case MEMORY_DEVICE_FS_DAX: > - __put_devmap_managed_page(page); > return true; > default: > break; > @@ -986,6 +986,27 @@ static inline bool put_devmap_managed_page(struct pa= ge *page) > return false; > } > > +static inline bool put_devmap_managed_page(struct page *page) > +{ > + bool is_devmap =3D page_is_devmap_managed(page); > + > + if (is_devmap) { > + int count =3D page_ref_dec_return(page); > + > + /* > + * devmap page refcounts are 1-based, rather than 0-based= : if > + * refcount is 1, then the page is free and the refcount = is > + * stable because nobody holds a reference on the page. 
> +                */
> +               if (count == 1)
> +                       free_devmap_managed_page(page);
> +               else if (!count)
> +                       __put_page(page);
> +       }
> +
> +       return is_devmap;
> +}
> +
>  #else /* CONFIG_DEV_PAGEMAP_OPS */
>  static inline bool put_devmap_managed_page(struct page *page)
>  {
> diff --git a/mm/memremap.c b/mm/memremap.c
> index 03ccbdfeb697..bc7e2a27d025 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -410,48 +410,39 @@ struct dev_pagemap *get_dev_pagemap(unsigned long pfn,
>  EXPORT_SYMBOL_GPL(get_dev_pagemap);
>
>  #ifdef CONFIG_DEV_PAGEMAP_OPS
> -void __put_devmap_managed_page(struct page *page)
> +void free_devmap_managed_page(struct page *page)
>  {
> -       int count = page_ref_dec_return(page);
> +       /* Clear Active bit in case of parallel mark_page_accessed */
> +       __ClearPageActive(page);
> +       __ClearPageWaiters(page);
> +
> +       mem_cgroup_uncharge(page);

Ugh, when did all this HMM-specific manipulation sneak into the
generic ZONE_DEVICE path? It used to be gated by pgmap type with its
own put_zone_device_private_page(). For example, it is certainly
unnecessary, and might be broken (would need to check), to call
mem_cgroup_uncharge() on a DAX page. ZONE_DEVICE users are not a
monolith, and the HMM use case leaks pages into code paths that DAX
explicitly avoids.
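
As an illustrative aside, this is roughly the kind of pgmap-type-gated
release path being described above. It is only a sketch: the use of the
dev_pagemap_ops->page_free() callback and the exact split between the
MEMORY_DEVICE_PRIVATE and fsdax cases are assumptions here, not the
actual historical put_zone_device_private_page() implementation.

/*
 * Sketch only, not mainline code: keep the HMM/device-private cleanup
 * (LRU flag clearing, memcg uncharge) behind a pgmap->type check so
 * that fsdax pages never go through it.
 */
void free_devmap_managed_page(struct page *page)
{
	if (page->pgmap->type == MEMORY_DEVICE_PRIVATE) {
		/* Clear Active bit in case of parallel mark_page_accessed */
		__ClearPageActive(page);
		__ClearPageWaiters(page);
		mem_cgroup_uncharge(page);
	}

	/* In either case, hand the page back to the owning driver. */
	page->pgmap->ops->page_free(page);
}

Whether fsdax pages need any of the flag clearing at all is exactly the
open question raised above.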
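
Also, for context on the 1-based refcounting in the quoted hunk, here is a
simplified caller-side sketch (not the exact mainline put_page()) of how a
caller is expected to divert devmap pages before the usual zero-based test:

/*
 * Simplified sketch: devmap pages are handled before the normal
 * "free at refcount 0" logic, because for them a refcount of 1 already
 * means "no users left" and the driver must be notified.
 */
static inline void put_page(struct page *page)
{
	page = compound_head(page);

	/*
	 * Devmap pages: put_devmap_managed_page() drops the reference and,
	 * on the 2 -> 1 transition, calls free_devmap_managed_page().
	 */
	if (put_devmap_managed_page(page))
		return;

	/* Everything else: free when the refcount hits zero. */
	if (put_page_testzero(page))
		__put_page(page);
}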