From: Dan Williams <dan.j.williams@intel.com>
To: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Cc: "Kirill A. Shutemov" <kirill@shutemov.name>,
	Linux MM <linux-mm@kvack.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.vnet.ibm.com>,
	Steve Capper <steve.capper@linaro.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Peter Zijlstra <peterz@infradead.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	Ingo Molnar <mingo@kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Borislav Petkov <bp@alien8.de>, Rik van Riel <riel@redhat.com>,
	Dann Frazier <dann.frazier@canonical.com>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Michal Hocko <mhocko@suse.cz>,
	linux-tip-commits@vger.kernel.org
Subject: Re: get_zone_device_page() in get_page() and page_cache_get_speculative()
Date: Tue, 25 Apr 2017 09:44:12 -0700
Message-ID: <CAPcyv4h7+Rgs83JefhJajHtitPWUFEKKgUt-_e-bqhQZM5L2FA@mail.gmail.com>
In-Reply-To: <20170425131904.nu5dlhweblwzyeit@black.fi.intel.com>

On Tue, Apr 25, 2017 at 6:19 AM, Kirill A. Shutemov
<kirill.shutemov@linux.intel.com> wrote:
> On Mon, Apr 24, 2017 at 11:41:51AM -0700, Dan Williams wrote:
>> On Mon, Apr 24, 2017 at 11:25 AM, Kirill A. Shutemov
>> <kirill.shutemov@linux.intel.com> wrote:
>> > On Mon, Apr 24, 2017 at 09:01:58PM +0300, Kirill A. Shutemov wrote:
>> >> On Mon, Apr 24, 2017 at 10:47:43AM -0700, Dan Williams wrote:
>> >> I think it's still better to do it on page_ref_* level.
>> >
>> > Something like patch below? What do you think?
>>
>> From a quick glance, I think this looks like the right way to go.
>
> Okay, but I would still like to remove the manipulation of pgmap->ref
> from the hot path.
>
> Can we just check that page_count() matches our expectation in
> devm_memremap_pages_release() instead of this?
>
> I am probably missing something in the bigger picture, but would something
> like the patch below work too? It seems to work for the test case.

No, unfortunately this is broken. It should be perfectly legal to
start the driver shutdown process while page references are still
outstanding. We use the percpu-ref infrastructure to wait for those
references to be dropped. With the approach below we'll just race and
crash.
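
To make the ordering concrete, here is a rough userspace sketch of the
"kill, then wait for outstanding references" pattern. It is not kernel
code: the model_ref_* names and demo_shutdown() are hypothetical, and a
mutex plus condition variable stand in for the percpu-ref machinery. It
is only loosely modeled on what percpu_ref_tryget_live() and
percpu_ref_put() provide.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <time.h>

/* Hypothetical userspace stand-in for the reference behind pgmap->ref. */
struct model_ref {
	pthread_mutex_t lock;
	pthread_cond_t released;
	long count;	/* starts at 1: the base reference held by the driver */
	bool dying;	/* set once shutdown has begun */
};

static void model_ref_init(struct model_ref *ref)
{
	pthread_mutex_init(&ref->lock, NULL);
	pthread_cond_init(&ref->released, NULL);
	ref->count = 1;
	ref->dying = false;
}

/* Take a reference, but only while shutdown has not started. */
static bool model_ref_tryget_live(struct model_ref *ref)
{
	bool ok;

	pthread_mutex_lock(&ref->lock);
	ok = !ref->dying;
	if (ok)
		ref->count++;
	pthread_mutex_unlock(&ref->lock);
	return ok;
}

/* Drop a reference and wake the waiter when the last one goes away. */
static void model_ref_put(struct model_ref *ref)
{
	pthread_mutex_lock(&ref->lock);
	assert(ref->count > 0);
	if (--ref->count == 0)
		pthread_cond_broadcast(&ref->released);
	pthread_mutex_unlock(&ref->lock);
}

/* Shutdown: refuse new gets, drop the base ref, sleep until count is 0. */
static void model_ref_kill_and_wait(struct model_ref *ref)
{
	pthread_mutex_lock(&ref->lock);
	ref->dying = true;
	ref->count--;
	while (ref->count > 0)
		pthread_cond_wait(&ref->released, &ref->lock);
	pthread_mutex_unlock(&ref->lock);
}

struct holder {
	struct model_ref *ref;
	bool dropped;
};

/* Simulates a page user that drops its reference a little later. */
static void *holder_thread(void *arg)
{
	struct holder *h = arg;
	struct timespec ts = { .tv_sec = 0, .tv_nsec = 10 * 1000 * 1000 };

	nanosleep(&ts, NULL);
	h->dropped = true;
	model_ref_put(h->ref);
	return NULL;
}

/* Returns 1 if shutdown waited for the holder and then refused new refs. */
int demo_shutdown(void)
{
	struct model_ref ref;
	struct holder h = { .ref = &ref, .dropped = false };
	pthread_t tid;
	int ok;

	model_ref_init(&ref);
	if (!model_ref_tryget_live(&ref))	/* reference is outstanding */
		return 0;
	if (pthread_create(&tid, NULL, holder_thread, &h))
		return 0;

	model_ref_kill_and_wait(&ref);		/* legal while refs are held */
	ok = h.dropped && !model_ref_tryget_live(&ref);
	pthread_join(tid, NULL);
	return ok;
}
```

The point of the sketch is that shutdown may begin while a reference is
still held and simply blocks until it is dropped. A one-time count check
at release time has nothing to wait on, so it can only warn and then
proceed to tear down pages that are still in use.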

>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index a835edd2db34..695da2a19b4c 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -762,19 +762,11 @@ static inline enum zone_type page_zonenum(const struct page *page)
>  }
>
>  #ifdef CONFIG_ZONE_DEVICE
> -void get_zone_device_page(struct page *page);
> -void put_zone_device_page(struct page *page);
>  static inline bool is_zone_device_page(const struct page *page)
>  {
>         return page_zonenum(page) == ZONE_DEVICE;
>  }
>  #else
> -static inline void get_zone_device_page(struct page *page)
> -{
> -}
> -static inline void put_zone_device_page(struct page *page)
> -{
> -}
>  static inline bool is_zone_device_page(const struct page *page)
>  {
>         return false;
> @@ -790,9 +782,6 @@ static inline void get_page(struct page *page)
>          */
>         VM_BUG_ON_PAGE(page_ref_count(page) <= 0, page);
>         page_ref_inc(page);
> -
> -       if (unlikely(is_zone_device_page(page)))
> -               get_zone_device_page(page);
>  }
>
>  static inline void put_page(struct page *page)
> @@ -801,9 +790,6 @@ static inline void put_page(struct page *page)
>
>         if (put_page_testzero(page))
>                 __put_page(page);
> -
> -       if (unlikely(is_zone_device_page(page)))
> -               put_zone_device_page(page);
>  }
>
>  #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
> diff --git a/kernel/memremap.c b/kernel/memremap.c
> index 07e85e5229da..e542bb2f7ab0 100644
> --- a/kernel/memremap.c
> +++ b/kernel/memremap.c
> @@ -182,18 +182,6 @@ struct page_map {
>         struct vmem_altmap altmap;
>  };
>
> -void get_zone_device_page(struct page *page)
> -{
> -       percpu_ref_get(page->pgmap->ref);
> -}
> -EXPORT_SYMBOL(get_zone_device_page);
> -
> -void put_zone_device_page(struct page *page)
> -{
> -       put_dev_pagemap(page->pgmap);
> -}
> -EXPORT_SYMBOL(put_zone_device_page);
> -
>  static void pgmap_radix_release(struct resource *res)
>  {
>         resource_size_t key, align_start, align_size, align_end;
> @@ -237,12 +225,21 @@ static void devm_memremap_pages_release(struct device *dev, void *data)
>         struct resource *res = &page_map->res;
>         resource_size_t align_start, align_size;
>         struct dev_pagemap *pgmap = &page_map->pgmap;
> +       unsigned long pfn;
>
>         if (percpu_ref_tryget_live(pgmap->ref)) {
>                 dev_WARN(dev, "%s: page mapping is still live!\n", __func__);
>                 percpu_ref_put(pgmap->ref);
>         }
>
> +       for_each_device_pfn(pfn, page_map) {
> +               struct page *page = pfn_to_page(pfn);
> +
> +               dev_WARN_ONCE(dev, page_count(page) != 1,
> +                               "%s: unexpected page count: %d!\n",
> +                               __func__, page_count(page));
> +       }
> +
>         /* pages are dead and unused, undo the arch mapping */
>         align_start = res->start & ~(SECTION_SIZE - 1);
>         align_size = ALIGN(resource_size(res), SECTION_SIZE);
> --
>  Kirill A. Shutemov

