LinuxPPC-Dev Archive on lore.kernel.org
 help / color / Atom feed
From: Dan Williams <dan.j.williams@intel.com>
To: David Hildenbrand <david@redhat.com>
Cc: linux-hyperv@vger.kernel.org, "Michal Hocko" <mhocko@suse.com>,
	"Radim Krčmář" <rkrcmar@redhat.com>,
	"KVM list" <kvm@vger.kernel.org>,
	"Pavel Tatashin" <pavel.tatashin@microsoft.com>,
	"KarimAllah Ahmed" <karahmed@amazon.de>,
	"Dave Hansen" <dave.hansen@linux.intel.com>,
	"Alexander Duyck" <alexander.duyck@gmail.com>,
	"Michal Hocko" <mhocko@kernel.org>,
	"Linux MM" <linux-mm@kvack.org>,
	"Paul Mackerras" <paulus@samba.org>,
	"H. Peter Anvin" <hpa@zytor.com>,
	"Wanpeng Li" <wanpengli@tencent.com>,
	"Alexander Duyck" <alexander.h.duyck@linux.intel.com>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	"Thomas Gleixner" <tglx@linutronix.de>,
	"Kees Cook" <keescook@chromium.org>,
	devel@driverdev.osuosl.org,
	"Stefano Stabellini" <sstabellini@kernel.org>,
	"Stephen Hemminger" <sthemmin@microsoft.com>,
	"Aneesh Kumar K.V" <aneesh.kumar@linux.ibm.com>,
	"Joerg Roedel" <joro@8bytes.org>, "X86 ML" <x86@kernel.org>,
	YueHaibing <yuehaibing@huawei.com>,
	"Matthew Wilcox (Oracle)" <willy@infradead.org>,
	"Mike Rapoport" <rppt@linux.ibm.com>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"Ingo Molnar" <mingo@redhat.com>,
	"Vlastimil Babka" <vbabka@suse.cz>,
	"Anthony Yznaga" <anthony.yznaga@oracle.com>,
	"Oscar Salvador" <osalvador@suse.de>,
	"Isaac J. Manjarres" <isaacm@codeaurora.org>,
	"Matt Sickler" <Matt.Sickler@daktronics.com>,
	"Juergen Gross" <jgross@suse.com>,
	"Anshuman Khandual" <anshuman.khandual@arm.com>,
	"Haiyang Zhang" <haiyangz@microsoft.com>,
	"Sasha Levin" <sashal@kernel.org>,
	kvm-ppc@vger.kernel.org, "Qian Cai" <cai@lca.pw>,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"Mike Rapoport" <rppt@linux.vnet.ibm.com>,
	"Borislav Petkov" <bp@alien8.de>,
	"Nicholas Piggin" <npiggin@gmail.com>,
	"Andy Lutomirski" <luto@kernel.org>,
	xen-devel <xen-devel@lists.xenproject.org>,
	"Boris Ostrovsky" <boris.ostrovsky@oracle.com>,
	"Vitaly Kuznetsov" <vkuznets@redhat.com>,
	"Allison Randal" <allison@lohutok.net>,
	"Jim Mattson" <jmattson@google.com>,
	"Mel Gorman" <mgorman@techsingularity.net>,
	"Cornelia Huck" <cohuck@redhat.com>,
	"Pavel Tatashin" <pasha.tatashin@soleen.com>,
	"Linux Kernel Mailing List" <linux-kernel@vger.kernel.org>,
	"Sean Christopherson" <sean.j.christopherson@intel.com>,
	"Johannes Weiner" <hannes@cmpxchg.org>,
	"Paolo Bonzini" <pbonzini@redhat.com>,
	"Andrew Morton" <akpm@linux-foundation.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [PATCH v1 01/10] mm/memory_hotplug: Don't allow to online/offline memory blocks with holes
Date: Mon, 4 Nov 2019 17:30:25 -0800
Message-ID: <CAPcyv4hT58=SDWYO2vrktdFOnDfWveVwN4ZBxNQ8=500_Zu7tQ@mail.gmail.com> (raw)
In-Reply-To: <20191024120938.11237-2-david@redhat.com>

On Thu, Oct 24, 2019 at 5:10 AM David Hildenbrand <david@redhat.com> wrote:
>
> Our onlining/offlining code is unnecessarily complicated. Only memory
> blocks added during boot can have holes (a range that is not
> IORESOURCE_SYSTEM_RAM). Hotplugged memory never has holes (e.g., see
> add_memory_resource()). All boot memory is alread online.

s/alread/already/

...also perhaps clarify "already online" by what point in time and why
that is relevant. For example a description of the difference between
the SetPageReserved() in the bootmem path and the one in the hotplug
path.

> Therefore, when we stop allowing to offline memory blocks with holes, we
> implicitly no longer have to deal with onlining memory blocks with holes.

Maybe an explicit reference of the code areas that deal with holes
would help to back up that assertion. Certainly it would have saved me
some time for the review.

> This allows to simplify the code. For example, we no longer have to
> worry about marking pages that fall into memory holes PG_reserved when
> onlining memory. We can stop setting pages PG_reserved.

...but not for bootmem, right?

>
> Offlining memory blocks added during boot is usually not guranteed to work

s/guranteed/guaranteed/

> either way (unmovable data might have easily ended up on that memory during
> boot). So stopping to do that should not really hurt (+ people are not
> even aware of a setup where that used to work

Maybe put a "Link: https://lkml.kernel.org/r/$msg_id" to that discussion?

> and that the existing code
> still works correctly with memory holes). For the use case of offlining
> memory to unplug DIMMs, we should see no change. (holes on DIMMs would be
> weird).

However, less memory can be offlined than was theoretically allowed
previously, so I don't understand the "we should see no change"
comment. I still agree that's a price worth paying to get the code
cleanups and if someone screams we can look at adding it back, but the
fact that it was already fragile seems decent enough protection.

>
> Please note that hardware errors (PG_hwpoison) are not memory holes and
> not affected by this change when offlining.
>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Oscar Salvador <osalvador@suse.de>
> Cc: Pavel Tatashin <pasha.tatashin@soleen.com>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Anshuman Khandual <anshuman.khandual@arm.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/memory_hotplug.c | 26 ++++++++++++++++++++++++--
>  1 file changed, 24 insertions(+), 2 deletions(-)
>
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index 561371ead39a..8d81730cf036 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1447,10 +1447,19 @@ static void node_states_clear_node(int node, struct memory_notify *arg)
>                 node_clear_state(node, N_MEMORY);
>  }
>
> +static int count_system_ram_pages_cb(unsigned long start_pfn,
> +                                    unsigned long nr_pages, void *data)
> +{
> +       unsigned long *nr_system_ram_pages = data;
> +
> +       *nr_system_ram_pages += nr_pages;
> +       return 0;
> +}
> +
>  static int __ref __offline_pages(unsigned long start_pfn,
>                   unsigned long end_pfn)
>  {
> -       unsigned long pfn, nr_pages;
> +       unsigned long pfn, nr_pages = 0;
>         unsigned long offlined_pages = 0;
>         int ret, node, nr_isolate_pageblock;
>         unsigned long flags;
> @@ -1461,6 +1470,20 @@ static int __ref __offline_pages(unsigned long start_pfn,
>
>         mem_hotplug_begin();
>
> +       /*
> +        * Don't allow to offline memory blocks that contain holes.
> +        * Consecuently, memory blocks with holes can never get onlined

s/Consecuently/Consequently/

> +        * (hotplugged memory has no holes and all boot memory is online).
> +        * This allows to simplify the onlining/offlining code quite a lot.
> +        */

The last sentence of this comment makes sense in the context of this
patch, but I don't think it stands by itself in the code base after
the fact. The person reading the comment can't see the simplifications
because the code is already gone. I'd clarify it to talk about why it
is safe to not mess around with PG_Reserved in the hotplug path
because of this check.

After those clarifications you can add:

Reviewed-by: Dan Williams <dan.j.williams@intel.com>

  reply index

Thread overview: 43+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2019-10-24 12:09 [PATCH v1 00/10] mm: Don't mark hotplugged pages PG_reserved (including ZONE_DEVICE) David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 01/10] mm/memory_hotplug: Don't allow to online/offline memory blocks with holes David Hildenbrand
2019-11-05  1:30   ` Dan Williams [this message]
2019-11-05  9:31     ` David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 02/10] KVM: x86/mmu: Prepare kvm_is_mmio_pfn() for PG_reserved changes David Hildenbrand
2019-11-05  1:37   ` Dan Williams
2019-11-05 11:09     ` David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 03/10] KVM: Prepare kvm_is_reserved_pfn() " David Hildenbrand
2019-11-05  4:38   ` Dan Williams
2019-11-05  9:17     ` David Hildenbrand
2019-11-05  9:49       ` David Hildenbrand
2019-11-05 10:02         ` David Hildenbrand
2019-11-05 16:00           ` Sean Christopherson
2019-11-05 20:30             ` David Hildenbrand
2019-11-05 22:22               ` Sean Christopherson
2019-11-05 23:02               ` Dan Williams
2019-11-05 23:13                 ` Sean Christopherson
2019-11-05 23:30                   ` Dan Williams
2019-11-05 23:42                     ` Sean Christopherson
2019-11-05 23:43                     ` Dan Williams
2019-11-06  0:03                       ` Sean Christopherson
2019-11-06  0:08                         ` Dan Williams
2019-11-06  6:56                           ` David Hildenbrand
2019-11-06 16:09                             ` Sean Christopherson
2019-10-24 12:09 ` [PATCH v1 04/10] vfio/type1: Prepare is_invalid_reserved_pfn() " David Hildenbrand
2019-11-07 15:40   ` Dan Williams
2019-11-07 18:22     ` David Hildenbrand
2019-11-07 22:07       ` David Hildenbrand
2019-11-08  5:09         ` Dan Williams
2019-11-08  7:14           ` David Hildenbrand
2019-11-08 10:21             ` David Hildenbrand
2019-11-08 18:29               ` Dan Williams
2019-11-08 23:01                 ` David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 05/10] powerpc/book3s: Prepare kvmppc_book3s_instantiate_page() " David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 06/10] powerpc/64s: Prepare hash_page_do_lazy_icache() " David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 07/10] powerpc/mm: Prepare maybe_pte_to_page() " David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 08/10] x86/mm: Prepare __ioremap_check_ram() " David Hildenbrand
2019-10-24 12:09 ` [PATCH v1 09/10] mm/memory_hotplug: Don't mark pages PG_reserved when initializing the memmap David Hildenbrand
2019-11-04 22:44   ` Boris Ostrovsky
2019-11-05 10:18     ` David Hildenbrand
2019-11-05 16:06       ` Boris Ostrovsky
2019-10-24 12:09 ` [PATCH v1 10/10] mm/usercopy.c: Update comment in check_page_span() regarding ZONE_DEVICE David Hildenbrand
2019-11-01 19:24 ` [PATCH v1 00/10] mm: Don't mark hotplugged pages PG_reserved (including ZONE_DEVICE) David Hildenbrand

Reply instructions:

You may reply publically to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAPcyv4hT58=SDWYO2vrktdFOnDfWveVwN4ZBxNQ8=500_Zu7tQ@mail.gmail.com' \
    --to=dan.j.williams@intel.com \
    --cc=Matt.Sickler@daktronics.com \
    --cc=akpm@linux-foundation.org \
    --cc=alex.williamson@redhat.com \
    --cc=alexander.duyck@gmail.com \
    --cc=alexander.h.duyck@linux.intel.com \
    --cc=allison@lohutok.net \
    --cc=aneesh.kumar@linux.ibm.com \
    --cc=anshuman.khandual@arm.com \
    --cc=anthony.yznaga@oracle.com \
    --cc=boris.ostrovsky@oracle.com \
    --cc=bp@alien8.de \
    --cc=cai@lca.pw \
    --cc=cohuck@redhat.com \
    --cc=dave.hansen@linux.intel.com \
    --cc=david@redhat.com \
    --cc=devel@driverdev.osuosl.org \
    --cc=haiyangz@microsoft.com \
    --cc=hannes@cmpxchg.org \
    --cc=hpa@zytor.com \
    --cc=isaacm@codeaurora.org \
    --cc=jgross@suse.com \
    --cc=jmattson@google.com \
    --cc=joro@8bytes.org \
    --cc=karahmed@amazon.de \
    --cc=keescook@chromium.org \
    --cc=kvm-ppc@vger.kernel.org \
    --cc=kvm@vger.kernel.org \
    --cc=kys@microsoft.com \
    --cc=linux-hyperv@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@kernel.org \
    --cc=mgorman@techsingularity.net \
    --cc=mhocko@kernel.org \
    --cc=mhocko@suse.com \
    --cc=mingo@redhat.com \
    --cc=npiggin@gmail.com \
    --cc=osalvador@suse.de \
    --cc=pasha.tatashin@soleen.com \
    --cc=paulus@samba.org \
    --cc=pavel.tatashin@microsoft.com \
    --cc=pbonzini@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rkrcmar@redhat.com \
    --cc=rppt@linux.ibm.com \
    --cc=rppt@linux.vnet.ibm.com \
    --cc=sashal@kernel.org \
    --cc=sean.j.christopherson@intel.com \
    --cc=sstabellini@kernel.org \
    --cc=sthemmin@microsoft.com \
    --cc=tglx@linutronix.de \
    --cc=vbabka@suse.cz \
    --cc=vkuznets@redhat.com \
    --cc=wanpengli@tencent.com \
    --cc=willy@infradead.org \
    --cc=x86@kernel.org \
    --cc=xen-devel@lists.xenproject.org \
    --cc=yuehaibing@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

LinuxPPC-Dev Archive on lore.kernel.org

Archives are clonable:
	git clone --mirror https://lore.kernel.org/linuxppc-dev/0 linuxppc-dev/git/0.git

	# If you have public-inbox 1.1+ installed, you may
	# initialize and index your mirror using the following commands:
	public-inbox-init -V2 linuxppc-dev linuxppc-dev/ https://lore.kernel.org/linuxppc-dev \
		linuxppc-dev@lists.ozlabs.org linuxppc-dev@ozlabs.org
	public-inbox-index linuxppc-dev

Example config snippet for mirrors

Newsgroup available over NNTP:
	nntp://nntp.lore.kernel.org/org.ozlabs.lists.linuxppc-dev


AGPL code for this site: git clone https://public-inbox.org/public-inbox.git