All of lore.kernel.org
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Dan Williams <dan.j.williams@intel.com>
Cc: David Hildenbrand <david@redhat.com>,
	Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
	"mhocko@kernel.org" <mhocko@kernel.org>,
	"adobriyan@gmail.com" <adobriyan@gmail.com>,
	"hch@lst.de" <hch@lst.de>,
	"longman@redhat.com" <longman@redhat.com>,
	"sfr@canb.auug.org.au" <sfr@canb.auug.org.au>,
	"mst@redhat.com" <mst@redhat.com>, "cai@lca.pw" <cai@lca.pw>,
	Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>,
	Junichi Nomura <j-nomura@ce.jp.nec.com>
Subject: Re: [PATCH 2/3] mm: Introduce subsection_dev_map
Date: Wed, 13 Nov 2019 20:53:06 +0100	[thread overview]
Message-ID: <6193C847-F09C-439A-81EE-98A59473D582@redhat.com> (raw)
In-Reply-To: <CAPcyv4iPk4bzOCE=7Eq8w-jwUuOXzZP9F=+RcxjqdXCn0SC01A@mail.gmail.com>



> Am 13.11.2019 um 20:06 schrieb Dan Williams <dan.j.williams@intel.com>:
> 
> On Wed, Nov 13, 2019 at 10:51 AM David Hildenbrand <david@redhat.com> wrote:
>> 
>>> On 08.11.19 20:13, Dan Williams wrote:
>>> On Thu, Nov 7, 2019 at 4:15 PM Toshiki Fukasawa
>>> <t-fukasawa@vx.jp.nec.com> wrote:
>>>> 
>>>> Currently, there is no way to identify pfn on ZONE_DEVICE.
>>>> Identifying pfn on system memory can be done by using a
>>>> section-level flag. On the other hand, identifying pfn on
>>>> ZONE_DEVICE requires a subsection-level flag since ZONE_DEVICE
>>>> can be created in units of subsections.
>>>> 
>>>> This patch introduces a new bitmap subsection_dev_map so that
>>>> we can identify pfn on ZONE_DEVICE.
>>>> 
>>>> Also, subsection_dev_map is used to prove that struct pages
>>>> included in the subsection have been initialized since it is
>>>> set after memmap_init_zone_device(). We can avoid accessing
>>>> pages currently being initialized by checking subsection_dev_map.
>>>> 
>>>> Signed-off-by: Toshiki Fukasawa <t-fukasawa@vx.jp.nec.com>
>>>> ---
>>>>  include/linux/mmzone.h | 19 +++++++++++++++++++
>>>>  mm/memremap.c          |  2 ++
>>>>  mm/sparse.c            | 32 ++++++++++++++++++++++++++++++++
>>>>  3 files changed, 53 insertions(+)
>>>> 
>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>>> index bda2028..11376c4 100644
>>>> --- a/include/linux/mmzone.h
>>>> +++ b/include/linux/mmzone.h
>>>> @@ -1174,11 +1174,17 @@ static inline unsigned long section_nr_to_pfn(unsigned long sec)
>>>> 
>>>>  struct mem_section_usage {
>>>>         DECLARE_BITMAP(subsection_map, SUBSECTIONS_PER_SECTION);
>>>> +#ifdef CONFIG_ZONE_DEVICE
>>>> +       DECLARE_BITMAP(subsection_dev_map, SUBSECTIONS_PER_SECTION);
>>>> +#endif
>>> 
>>> Hi Toshiki,
>>> 
>>> There is currently an effort to remove the PageReserved() flag as some
>>> code is using that to detect ZONE_DEVICE. In reviewing those patches
>>> we realized that what many code paths want is to detect online memory.
>>> So instead of a subsection_dev_map add a subsection_online_map. That
>>> way pfn_to_online_page() can reliably avoid ZONE_DEVICE ranges. I
>>> otherwise question the use case for pfn_walkers to return pages for
>>> ZONE_DEVICE pages, I think the skip behavior when pfn_to_online_page()
>>> == false is the right behavior.
>> 
>> To be more precise, I recommended an subsection_active_map, to indicate
>> which memmaps were fully initialized and can safely be touched (e.g., to
>> read the zone/nid). This map would also be set when the devmem memmaps
>> were initialized (race between adding memory/growing the section and
>> initializing the memmap).
>> 
>> See
>> 
>> https://lkml.org/lkml/2019/10/10/87
>> 
>> and
>> 
>> https://www.spinics.net/lists/linux-driver-devel/msg130012.html
> 
> I'm still struggling to understand the motivation of distinguishing
> "active" as something distinct from "online". As long as the "online"
> granularity is improved from sections down to subsections then most
> code paths are good to go. The others can use get_devpagemap() to
> check for ZONE_DEVICE in a race free manner as they currently do.

I thought we wanted to unify access if we don’t really care about the zone as in most pfn walkers - E.g., for zone shrinking. Anyhow, a subsection online map would be a good start, we can reuse that later for ZONE_DEVICE as well.

> 
>> I dislike a map that is specific to ZONE_DEVICE or (currently)
>> !ZONE_DEVICE. I rather want an indication "this memmap is safe to
>> touch". As discussed along the mentioned threads, we can combine this
>> later with RCU to handle some races that are currently possible.
> 
> The rcu protection is independent of the pfn_active vs pfn_online
> distinction afaics.

It’s one part of the bigger picture IMHO.

> 


  reply	other threads:[~2019-11-13 19:53 UTC|newest]

Thread overview: 30+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-11-08  0:08 [PATCH 0/3] make pfn walker support ZONE_DEVICE Toshiki Fukasawa
2019-11-08  0:08 ` [PATCH 1/3] procfs: refactor kpage_*_read() in fs/proc/page.c Toshiki Fukasawa
2019-11-08  0:08 ` [PATCH 2/3] mm: Introduce subsection_dev_map Toshiki Fukasawa
2019-11-08 19:13   ` Dan Williams
2019-11-08 19:13     ` Dan Williams
2019-11-13 18:51     ` David Hildenbrand
2019-11-13 19:06       ` Dan Williams
2019-11-13 19:06         ` Dan Williams
2019-11-13 19:53         ` David Hildenbrand [this message]
2019-11-13 20:10           ` Dan Williams
2019-11-13 20:10             ` Dan Williams
2019-11-13 20:23             ` David Hildenbrand
2019-11-13 20:40               ` David Hildenbrand
2019-11-13 21:11                 ` Dan Williams
2019-11-13 21:11                   ` Dan Williams
2019-11-13 21:22                   ` David Hildenbrand
2019-11-13 21:26                     ` Dan Williams
2019-11-13 21:26                       ` Dan Williams
2019-11-14 23:36                       ` Toshiki Fukasawa
2019-11-15  0:46                         ` David Hildenbrand
2019-11-15  2:57                           ` Toshiki Fukasawa
2019-11-08  0:08 ` [PATCH 3/3] mm: make pfn walker support ZONE_DEVICE Toshiki Fukasawa
2019-11-09 17:08   ` kbuild test robot
2019-11-09 17:08     ` kbuild test robot
2019-11-09 19:14   ` kbuild test robot
2019-11-09 19:14     ` kbuild test robot
2019-11-08  9:18 ` [PATCH 0/3] " Michal Hocko
2019-11-11  8:00   ` Toshiki Fukasawa
2019-11-11 16:23     ` Dan Williams
2019-11-11 16:23       ` Dan Williams

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6193C847-F09C-439A-81EE-98A59473D582@redhat.com \
    --to=david@redhat.com \
    --cc=adobriyan@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=cai@lca.pw \
    --cc=dan.j.williams@intel.com \
    --cc=hch@lst.de \
    --cc=j-nomura@ce.jp.nec.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=longman@redhat.com \
    --cc=mhocko@kernel.org \
    --cc=mst@redhat.com \
    --cc=n-horiguchi@ah.jp.nec.com \
    --cc=sfr@canb.auug.org.au \
    --cc=t-fukasawa@vx.jp.nec.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.