From: Igor Mammedov <imammedo@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>, Vitaly Kuznetsov <vkuznets@redhat.com>, linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>, Greg KH <gregkh@linuxfoundation.org>, "K. Y. Srinivasan" <kys@microsoft.com>, David Rientjes <rientjes@google.com>, Daniel Kiper <daniel.kiper@oracle.com>, linux-api@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>, linux-s390@vger.kernel.org, xen-devel@lists.xenproject.org, linux-acpi@vger.kernel.org, qiuxishi@huawei.com, toshi.kani@hpe.com, xieyisheng1@huawei.com, slaoub@gmail.com, iamjoonsoo.kim@lge.com, vbabka@suse.cz, Zhang Zhen <zhenzhang.zhang@huawei.com>, Reza Arbab <arbab@linux.vnet.ibm.com>, Yasuaki Ishimatsu <yasu.isimatu@gmail.com>, Tang Chen <tangchen@cn.fujitsu.com>
Subject: Re: WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks)
Date: Mon, 13 Mar 2017 11:31:10 +0100
Message-ID: <20170313113110.6a9636a1@nial.brq.redhat.com>
In-Reply-To: <20170310135807.GI3753@dhcp22.suse.cz>

On Fri, 10 Mar 2017 14:58:07 +0100
Michal Hocko <mhocko@kernel.org> wrote:

> Let's CC people touching this logic. A short summary is that onlining
> memory via udev is currently unusable for online_movable because blocks
> are added from lower addresses while movable blocks are allowed from
> last blocks. More below.
>
> On Thu 09-03-17 13:54:00, Michal Hocko wrote:
> > On Tue 07-03-17 13:40:04, Igor Mammedov wrote:
> > > On Mon, 6 Mar 2017 15:54:17 +0100
> > > Michal Hocko <mhocko@kernel.org> wrote:
> > >
> > > > On Fri 03-03-17 18:34:22, Igor Mammedov wrote:
> > [...]
> > > > > in current mainline kernel it triggers following code path:
> > > > >
> > > > > online_pages()
> > > > > ...
> > > > >     if (online_type == MMOP_ONLINE_KERNEL) {
> > > > >         if (!zone_can_shift(pfn, nr_pages, ZONE_NORMAL, &zone_shift))
> > > > >             return -EINVAL;
> > > >
> > > > Are you sure? I would expect MMOP_ONLINE_MOVABLE here
> > >
> > > pretty much, reproducer is above so try and see for yourself
> >
> > I will play with this...
>
> OK so I did with -m 2G,slots=4,maxmem=4G -numa node,mem=1G -numa node,mem=1G which generated

'mem' here distributes the boot memory specified by "-m 2G" and does not
include memory specified by -device pc-dimm.

> [...]
> [ 0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
> [ 0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0x3fffffff]
> [ 0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x40000000-0x7fffffff]
> [ 0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x27fffffff] hotplug
> [ 0.000000] NUMA: Node 0 [mem 0x00000000-0x0009ffff] + [mem 0x00100000-0x3fffffff] -> [mem 0x00000000-0x3fffffff]
> [ 0.000000] NODE_DATA(0) allocated [mem 0x3fffc000-0x3fffffff]
> [ 0.000000] NODE_DATA(1) allocated [mem 0x7ffdc000-0x7ffdffff]
> [ 0.000000] Zone ranges:
> [ 0.000000] DMA [mem 0x0000000000001000-0x0000000000ffffff]
> [ 0.000000] DMA32 [mem 0x0000000001000000-0x000000007ffdffff]
> [ 0.000000] Normal empty
> [ 0.000000] Movable zone start for each node
> [ 0.000000] Early memory node ranges
> [ 0.000000] node 0: [mem 0x0000000000001000-0x000000000009efff]
> [ 0.000000] node 0: [mem 0x0000000000100000-0x000000003fffffff]
> [ 0.000000] node 1: [mem 0x0000000040000000-0x000000007ffdffff]
>
> so there is neither any normal zone nor movable one at the boot time.

There could be one if hotpluggable memory were present at boot time in the
E820 table (if I remember right, when running on Hyper-V there is a movable
zone at boot time), but in QEMU hotpluggable memory isn't put into E820, so
the zone is allocated later, when the memory is enumerated by the ACPI
subsystem and onlined. That causes fewer issues wrt the movable zone and
works for different versions of Linux/Windows as well.

That's where in-kernel auto-onlining could also be useful, since the user
would be able to start up with a small amount of non-removable memory plus
several removable DIMMs and have all the memory onlined/available by the
time the initrd is loaded. (The missing piece here is onlining removable
memory as movable by default.)

> Then I hotplugged 1G slot
> (qemu) object_add memory-backend-ram,id=mem1,size=1G
> (qemu) device_add pc-dimm,id=dimm1,memdev=mem1

You can also specify the node a pc-dimm goes to with the 'node' property if
it should go to a node other than node 0:

  device_add pc-dimm,id=dimm1,memdev=mem1,node=1

> unfortunately the memory didn't show up automatically and I got
> [ 116.375781] acpi PNP0C80:00: Enumeration failure

It should work. Do you have CONFIG_ACPI_HOTPLUG_MEMORY enabled?

> so I had to probe it manually (probably the BIOS my qemu uses doesn't
> support auto probing - I haven't really dug further). Anyway the SRAT
> table printed during the boot told that we should start at 0x100000000
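As a rough illustration of the address arithmetic behind the quoted SRAT output, the sketch below maps the hotplug window 0x100000000-0x27fffffff to memory block numbers under /sys/devices/system/memory. It assumes a 128 MiB memory block size (the real value is reported in hex by /sys/devices/system/memory/block_size_bytes and may differ) and assumes the first 1G DIMM is placed at the start of the hotplug window. The helper names are made up for this example and are not part of any kernel or QEMU interface.

```python
# Sketch only: map physical address ranges to sysfs memory block numbers.
# ASSUMPTION: 128 MiB blocks; check /sys/devices/system/memory/block_size_bytes.
SYSFS_MEMORY = "/sys/devices/system/memory"
ASSUMED_BLOCK_SIZE = 0x8000000  # 128 MiB


def block_indices(start, end_inclusive, block_size=ASSUMED_BLOCK_SIZE):
    """Return the memory block numbers covering [start, end_inclusive]."""
    first = start // block_size
    last = end_inclusive // block_size
    return range(first, last + 1)


def state_path(index):
    """Path of the 'state' file for a block (hypothetical helper)."""
    return f"{SYSFS_MEMORY}/memory{index}/state"


# Hotplug window taken from the SRAT line quoted above.
hotplug_window = block_indices(0x100000000, 0x27FFFFFFF)

# ASSUMPTION: the first 1G pc-dimm starts at the bottom of that window.
first_dimm = block_indices(0x100000000, 0x100000000 + (1 << 30) - 1)

if __name__ == "__main__":
    print("hotplug window blocks:", hotplug_window[0], "to", hotplug_window[-1])
    for idx in first_dimm:
        print(state_path(idx))
```

Under these assumptions the 1G DIMM covers eight consecutive blocks starting at memory32, which are the lowest blocks of the hotplug window. That is the situation the summary at the top of the thread describes: udev sees blocks appear from the lower addresses first.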