linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: YASUAKI ISHIMATSU <yasu.isimatu@gmail.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Igor Mammedov <imammedo@redhat.com>,
	Heiko Carstens <heiko.carstens@de.ibm.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	David Rientjes <rientjes@google.com>,
	Daniel Kiper <daniel.kiper@oracle.com>,
	linux-api@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	linux-s390@vger.kernel.org, xen-devel@lists.xenproject.org,
	linux-acpi@vger.kernel.org, qiuxishi@huawei.com,
	toshi.kani@hpe.com, xieyisheng1@huawei.com, slaoub@gmail.com,
	iamjoonsoo.kim@lge.com, vbabka@suse.cz,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	yasu.isimatu@gmail.com
Subject: Re: WTH is going on with memory hotplug sysf interface
Date: Tue, 14 Mar 2017 12:05:59 -0400	[thread overview]
Message-ID: <99f14975-f89f-4484-6ae1-296b242d4bf9@gmail.com> (raw)
In-Reply-To: <20170313091907.GF31518@dhcp22.suse.cz>



On 03/13/2017 05:19 AM, Michal Hocko wrote:
> On Fri 10-03-17 12:39:27, Yasuaki Ishimatsu wrote:
>> On 03/10/2017 08:58 AM, Michal Hocko wrote:
> [...]
>>> OK so I did with -m 2G,slots=4,maxmem=4G -numa node,mem=1G -numa node,mem=1G which generated
>>> [...]
>>> [    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
>>> [    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0x3fffffff]
>>> [    0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x40000000-0x7fffffff]
>>> [    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x27fffffff] hotplug
>>> [    0.000000] NUMA: Node 0 [mem 0x00000000-0x0009ffff] + [mem 0x00100000-0x3fffffff] -> [mem 0x00000000-0x3fffffff]
>>> [    0.000000] NODE_DATA(0) allocated [mem 0x3fffc000-0x3fffffff]
>>> [    0.000000] NODE_DATA(1) allocated [mem 0x7ffdc000-0x7ffdffff]
>>> [    0.000000] Zone ranges:
>>> [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
>>> [    0.000000]   DMA32    [mem 0x0000000001000000-0x000000007ffdffff]
>>> [    0.000000]   Normal   empty
>>> [    0.000000] Movable zone start for each node
>>> [    0.000000] Early memory node ranges
>>> [    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
>>> [    0.000000]   node   0: [mem 0x0000000000100000-0x000000003fffffff]
>>> [    0.000000]   node   1: [mem 0x0000000040000000-0x000000007ffdffff]
>>>
>>> so there is neither any normal zone nor movable one at the boot time.
>>> Then I hotplugged 1G slot
>>> (qemu) object_add memory-backend-ram,id=mem1,size=1G
>>> (qemu) device_add pc-dimm,id=dimm1,memdev=mem1
>>>
>>> unfortunatelly the memory didn't show up automatically and I got
>>> [  116.375781] acpi PNP0C80:00: Enumeration failure
>>>
>>> so I had to probe it manually (prbably the BIOS my qemu uses doesn't
>>> support auto probing - I haven't really dug further). Anyway the SRAT
>>> table printed during the boot told that we should start at 0x100000000
>>>
>>> # echo 0x100000000 > /sys/devices/system/memory/probe
>>> # grep . /sys/devices/system/memory/memory32/valid_zones
>>> Normal Movable
>>>
>>> which looks reasonably right? Both Normal and Movable zones are allowed
>>>
>>> # echo $((0x100000000+(128<<20))) > /sys/devices/system/memory/probe
>>> # grep . /sys/devices/system/memory/memory3?/valid_zones
>>> /sys/devices/system/memory/memory32/valid_zones:Normal
>>> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
>>>
>>> Huh, so our valid_zones have changed under our feet...
>>>
>>> # echo $((0x100000000+2*(128<<20))) > /sys/devices/system/memory/probe
>>> # grep . /sys/devices/system/memory/memory3?/valid_zones
>>> /sys/devices/system/memory/memory32/valid_zones:Normal
>>> /sys/devices/system/memory/memory33/valid_zones:Normal
>>> /sys/devices/system/memory/memory34/valid_zones:Normal Movable
>>>
>>> and again. So only the last memblock is considered movable. Let's try to
>>> online them now.
>>>
>>> # echo online_movable > /sys/devices/system/memory/memory34/state
>>> # grep . /sys/devices/system/memory/memory3?/valid_zones
>>> /sys/devices/system/memory/memory32/valid_zones:Normal
>>> /sys/devices/system/memory/memory33/valid_zones:Normal Movable
>>> /sys/devices/system/memory/memory34/valid_zones:Movable Normal
>>>
>>
>> I think there is no strong reason which kernel has the restriction.
>> By setting the restrictions, it seems to have made management of
>> these zone structs simple.
>
> Could you be more specific please? How could this make management any
> easier when udev is basically racing with the physical hotplug and the
> result is basically undefined?
>

When changing zone from NORMAL(N) to MOVALBE(M), we must resize both zones,
zone->zone_start_pfn and zone->spanned_pages. Currently there is the
restriction.

So we just simply change:
   zone(N)->spanned_pages -= nr_pages
   zone(M)->zone_start_pfn -= nr_pages

But if every memory can change zone with no restriction, we must recalculate
these zones spanned_pages and zone_start_pfn follows:

   memory section #
    1 2 3 4 5 6 7
   |N|M|N|N|N|M|M|
      |
   |N|N|N|N|N|M|M|
  * change memory section #2 from MOVABLE to NORMAL.
    then we must find next movable memory section (#6) and resize these zones.

I think when implementing movable memory, there is no requirement of this.
So kernel has the current restriction.

Thanks,
Yasuaki Ishimatsu

  reply	other threads:[~2017-03-14 16:06 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-02-27  9:28 [RFC PATCH] mm, hotplug: get rid of auto_online_blocks Michal Hocko
2017-02-27 10:02 ` Vitaly Kuznetsov
2017-02-27 10:21   ` Michal Hocko
2017-02-27 10:49     ` Vitaly Kuznetsov
2017-02-27 12:56       ` Michal Hocko
2017-02-27 13:17         ` Vitaly Kuznetsov
2017-02-27 11:25   ` Heiko Carstens
2017-02-27 11:50     ` Vitaly Kuznetsov
2017-02-27 15:43     ` Michal Hocko
2017-02-28 10:21       ` Heiko Carstens
2017-03-02 13:53       ` Igor Mammedov
2017-03-02 14:28         ` Michal Hocko
2017-03-02 17:03           ` Igor Mammedov
2017-03-03  8:27             ` Michal Hocko
2017-03-03 17:34               ` Igor Mammedov
2017-03-06 14:54                 ` Michal Hocko
2017-03-07 12:40                   ` Igor Mammedov
2017-03-09 12:54                     ` Michal Hocko
2017-03-10 13:58                       ` WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks) Michal Hocko
2017-03-10 15:53                         ` Michal Hocko
2017-03-10 19:00                           ` Reza Arbab
2017-03-13  9:21                             ` Michal Hocko
2017-03-13 14:58                               ` Reza Arbab
2017-03-14 19:35                               ` Andrea Arcangeli
2017-03-15  7:57                                 ` Michal Hocko
2017-03-13 15:11                           ` Michal Hocko
2017-03-13 23:16                             ` Andi Kleen
2017-03-10 17:39                         ` WTH is going on with memory hotplug sysf interface Yasuaki Ishimatsu
2017-03-13  9:19                           ` Michal Hocko
2017-03-14 16:05                             ` YASUAKI ISHIMATSU [this message]
2017-03-14 16:20                               ` Michal Hocko
2017-03-13 10:31                         ` WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks) Igor Mammedov
2017-03-13 10:43                           ` Michal Hocko
2017-03-13 13:57                             ` Igor Mammedov
2017-03-13 14:36                               ` Michal Hocko
2017-03-13 10:55                       ` [RFC PATCH] mm, hotplug: get rid of auto_online_blocks Igor Mammedov
2017-03-13 12:28                         ` Michal Hocko
2017-03-13 12:54                           ` Vitaly Kuznetsov
2017-03-13 13:19                             ` Michal Hocko
2017-03-13 13:42                               ` Vitaly Kuznetsov
2017-03-13 14:32                                 ` Michal Hocko
2017-03-13 15:10                                   ` Vitaly Kuznetsov
2017-03-14 13:20                           ` Igor Mammedov
2017-03-15  7:53                             ` Michal Hocko
2017-03-10 22:00                   ` Daniel Kiper
2017-02-27 17:28 ` Reza Arbab
2017-02-27 17:34   ` Michal Hocko

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=99f14975-f89f-4484-6ae1-296b242d4bf9@gmail.com \
    --to=yasu.isimatu@gmail.com \
    --cc=akpm@linux-foundation.org \
    --cc=arbab@linux.vnet.ibm.com \
    --cc=daniel.kiper@oracle.com \
    --cc=gregkh@linuxfoundation.org \
    --cc=heiko.carstens@de.ibm.com \
    --cc=iamjoonsoo.kim@lge.com \
    --cc=imammedo@redhat.com \
    --cc=kys@microsoft.com \
    --cc=linux-acpi@vger.kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mhocko@kernel.org \
    --cc=qiuxishi@huawei.com \
    --cc=rientjes@google.com \
    --cc=slaoub@gmail.com \
    --cc=toshi.kani@hpe.com \
    --cc=vbabka@suse.cz \
    --cc=vkuznets@redhat.com \
    --cc=xen-devel@lists.xenproject.org \
    --cc=xieyisheng1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).