From: Igor Mammedov <imammedo@redhat.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Greg KH <gregkh@linuxfoundation.org>,
	"K. Y. Srinivasan" <kys@microsoft.com>,
	David Rientjes <rientjes@google.com>,
	Daniel Kiper <daniel.kiper@oracle.com>,
	linux-api@vger.kernel.org, LKML <linux-kernel@vger.kernel.org>,
	linux-s390@vger.kernel.org, xen-devel@lists.xenproject.org,
	linux-acpi@vger.kernel.org, qiuxishi@huawei.com,
	toshi.kani@hpe.com, xieyisheng1@huawei.com, slaoub@gmail.com,
	iamjoonsoo.kim@lge.com, vbabka@suse.cz,
	Zhang Zhen <zhenzhang.zhang@huawei.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	Tang Chen <tangchen@cn.fujitsu.com>
Subject: Re: WTH is going on with memory hotplug sysf interface (was: Re: [RFC PATCH] mm, hotplug: get rid of auto_online_blocks)
Date: Mon, 13 Mar 2017 14:57:12 +0100
Message-ID: <20170313145712.49a2d346@nial.brq.redhat.com>
In-Reply-To: <20170313104302.GK31518@dhcp22.suse.cz>

On Mon, 13 Mar 2017 11:43:02 +0100
Michal Hocko <mhocko@kernel.org> wrote:

> On Mon 13-03-17 11:31:10, Igor Mammedov wrote:
> > On Fri, 10 Mar 2017 14:58:07 +0100  
> [...]
> > > [    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
> > > [    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0x3fffffff]
> > > [    0.000000] ACPI: SRAT: Node 1 PXM 1 [mem 0x40000000-0x7fffffff]
> > > [    0.000000] ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x27fffffff] hotplug
> > > [    0.000000] NUMA: Node 0 [mem 0x00000000-0x0009ffff] + [mem 0x00100000-0x3fffffff] -> [mem 0x00000000-0x3fffffff]
> > > [    0.000000] NODE_DATA(0) allocated [mem 0x3fffc000-0x3fffffff]
> > > [    0.000000] NODE_DATA(1) allocated [mem 0x7ffdc000-0x7ffdffff]
> > > [    0.000000] Zone ranges:
> > > [    0.000000]   DMA      [mem 0x0000000000001000-0x0000000000ffffff]
> > > [    0.000000]   DMA32    [mem 0x0000000001000000-0x000000007ffdffff]
> > > [    0.000000]   Normal   empty
> > > [    0.000000] Movable zone start for each node
> > > [    0.000000] Early memory node ranges
> > > [    0.000000]   node   0: [mem 0x0000000000001000-0x000000000009efff]
> > > [    0.000000]   node   0: [mem 0x0000000000100000-0x000000003fffffff]
> > > [    0.000000]   node   1: [mem 0x0000000040000000-0x000000007ffdffff]
> > > 
> > > so there is neither any normal zone nor movable one at the boot time.  
> > it could be if hotpluggable memory were present at boot time in the E820 table
> > (if I remember right, when running on Hyper-V there is a movable zone at boot time),
> > 
> > but in QEMU hotpluggable memory isn't put into E820,
> > so the zone is allocated later, when the memory is enumerated
> > by the ACPI subsystem and onlined.
> > It causes fewer issues wrt the movable zone and works for
> > different versions of Linux/Windows as well.
> > 
> > That's where in-kernel auto-onlining could also be useful,
> > since the user would be able to start up with a small amount of
> > non-removable memory plus several removable DIMMs
> > and have all the memory onlined/available by the time
> > the initrd is loaded. (The missing piece here is onlining
> > removable memory as movable by default.)  
> 
> Why should we even care to online that memory that early rather than
> making it available via e820?

It's not forbidden by the spec and has fewer complications
when it comes to removable memory. Declaring it in E820
would add the following limitations/drawbacks:
 - firmware would have to be able to exclude removable memory
   from its own usage (currently neither SeaBIOS nor EFI has to
   know/care about it) => keeping it out of E820 means less
   qemu-guest ABI to maintain.
 - the OS would have to be taught to avoid/move (early) nonmovable
   allocations from removable address ranges.
   There were patches targeting that in recent kernels,
   but it won't work with older kernels that don't have them,
   which limits the range of OSes that could run on QEMU
   and do memory removal.
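
To make the current layout concrete, here is a rough Python sketch
that parses the SRAT and "Early memory node ranges" lines from the
boot log quoted above and checks whether the hotplug window overlaps
any range that is present at boot. The regular expressions and the
helper names (parse, hotplug_windows) are assumptions made for
illustration; they only handle the log format shown in this mail.

```python
import re

# Lines taken from the boot log quoted above (timestamps dropped).
BOOT_LOG = """\
ACPI: SRAT: Node 0 PXM 0 [mem 0x00000000-0x0009ffff]
ACPI: SRAT: Node 0 PXM 0 [mem 0x00100000-0x3fffffff]
ACPI: SRAT: Node 1 PXM 1 [mem 0x40000000-0x7fffffff]
ACPI: SRAT: Node 0 PXM 0 [mem 0x100000000-0x27fffffff] hotplug
  node   0: [mem 0x0000000000001000-0x000000000009efff]
  node   0: [mem 0x0000000000100000-0x000000003fffffff]
  node   1: [mem 0x0000000040000000-0x000000007ffdffff]
"""

RANGE = r"\[mem (0x[0-9a-f]+)-(0x[0-9a-f]+)\]"
SRAT_RE = re.compile(r"SRAT: Node (\d+) PXM \d+ " + RANGE + r"( hotplug)?")
EARLY_RE = re.compile(r"node\s+(\d+): " + RANGE)


def parse(log):
    """Return (srat, early) lists of inclusive address ranges per node."""
    srat, early = [], []
    for line in log.splitlines():
        m = SRAT_RE.search(line)
        if m:
            start, end = int(m.group(2), 16), int(m.group(3), 16)
            srat.append((int(m.group(1)), start, end, m.group(4) is not None))
            continue
        m = EARLY_RE.search(line)
        if m:
            early.append((int(m.group(1)), int(m.group(2), 16), int(m.group(3), 16)))
    return srat, early


def hotplug_windows(log):
    """For each SRAT hotplug range: (start, end, size in GiB, present at boot?)."""
    srat, early = parse(log)
    result = []
    for _node, start, end, hotplug in srat:
        if not hotplug:
            continue
        # Two inclusive ranges overlap if each starts before the other ends.
        at_boot = any(start <= e_end and e_start <= end
                      for _n, e_start, e_end in early)
        result.append((start, end, (end - start + 1) / 2**30, at_boot))
    return result


if __name__ == "__main__":
    for start, end, gib, at_boot in hotplug_windows(BOOT_LOG):
        print(f"hotplug window {start:#x}-{end:#x}: {gib:g} GiB, "
              f"present at boot: {at_boot}")
```

For the log above this finds a single 6 GiB hotplug window at
0x100000000-0x27fffffff that does not overlap any early memory
range, i.e. firmware and early boot code never see it as usable RAM.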

The E820-less approach works reasonably well with a wide range
of guest OSes and is less complex than if removable memory
were present in E820. Hence I don't have a compelling
reason to introduce removable memory in E820, as it
only adds to hot(un)plug issues.
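
With the E820-less approach, hotplugged DIMMs show up as offline
memory blocks once ACPI has enumerated them (unless auto-onlining is
enabled), and something still has to online them. Below is a minimal
userspace sketch in Python, assuming the usual
/sys/devices/system/memory/memoryN/state files and the
"online_movable" state string. The function name and the `root`
parameter are made up for illustration, and the kernel may still
refuse the request depending on the zone layout.

```python
import os

SYSFS_MEMORY = "/sys/devices/system/memory"


def online_offline_blocks(root=SYSFS_MEMORY, mode="online_movable"):
    """Write `mode` to the state file of every offline memory block.

    Returns the names of the blocks that accepted the write. Needs root
    on a real system; `root` is a parameter so the logic can be tried on
    a scratch directory first.
    """
    # Block directories are named memory<N>; skip other sysfs entries
    # such as auto_online_blocks or block_size_bytes.
    blocks = [n for n in os.listdir(root)
              if n.startswith("memory") and n[len("memory"):].isdigit()]
    switched = []
    for name in sorted(blocks, key=lambda n: int(n[len("memory"):])):
        state_path = os.path.join(root, name, "state")
        try:
            with open(state_path) as f:
                state = f.read().strip()
        except OSError:
            continue
        if state != "offline":
            continue
        try:
            with open(state_path, "w") as f:
                f.write(mode)
        except OSError:
            # The kernel rejected the transition; leave the block offline.
            continue
        switched.append(name)
    return switched


if __name__ == "__main__":
    print(online_offline_blocks())
```

Passing mode="online" instead leaves the zone choice to the kernel.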

I have an off-tree QEMU hack that puts hot-removable
memory added with "-device pc-dimm" on the CLI into E820
to experiment with. It could be useful for playing
with zone layouts at boot time, so if you are
interested I can rebase it on top of current master
and post it here.

