Linux-api Archive on lore.kernel.org
From: Vlastimil Babka <vbabka@suse.cz>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Reza Arbab <arbab@linux.vnet.ibm.com>,
	Yasuaki Ishimatsu <yasu.isimatu@gmail.com>,
	qiuxishi@huawei.com, Kani Toshimitsu <toshi.kani@hpe.com>,
	slaoub@gmail.com, Joonsoo Kim <js1304@gmail.com>,
	Daniel Kiper <daniel.kiper@oracle.com>,
	Igor Mammedov <imammedo@redhat.com>,
	Vitaly Kuznetsov <vkuznets@redhat.com>,
	Wei Yang <richard.weiyang@gmail.com>,
	LKML <linux-kernel@vger.kernel.org>,
	Linux API <linux-api@vger.kernel.org>
Subject: Re: [PATCH 2/2] mm, memory_hotplug: remove zone restrictions
Date: Mon, 10 Jul 2017 13:11:29 +0200
Message-ID: <24c3606d-837a-266d-a294-7e100d1430f0@suse.cz>
In-Reply-To: <20170710064540.GA19185@dhcp22.suse.cz>

On 07/10/2017 08:45 AM, Michal Hocko wrote:
> On Fri 07-07-17 17:02:59, Vlastimil Babka wrote:
>> [+CC linux-api]
>>
>> On 06/29/2017 09:35 AM, Michal Hocko wrote:
>>> From: Michal Hocko <mhocko@suse.com>
>>>
>>> Historically we have enforced that any kernel zone (e.g. ZONE_NORMAL) has
>>> to precede the Movable zone in the physical memory range. The purpose of
>>> the movable zone is, however, not bound to any physical memory restriction.
>>> It merely defines a class of migratable and reclaimable memory.
>>>
>>> There are users (e.g. CMA) who might want to reserve specific physical
>>> memory ranges for their own purpose. Moreover our pfn walkers have to be
>>> prepared for zones overlapping in the physical range already because we
>>> do support interleaving NUMA nodes and therefore zones can interleave as
>>> well. This means we can allow each memory block to be associated with a
>>> different zone.
>>>
>>> Loosen the current onlining semantic and allow explicit onlining type on
>>> any memblock. That means that online_{kernel,movable} will be allowed
>>> regardless of the physical address of the memblock, as long as it is
>>> offline of course. This might result in the movable zone overlapping with
>>> other kernel zones. Default onlining then becomes a bit tricky but still
>>> sensible. echo online > memoryXY/state will online the given block to
>>> 	1) the default zone if the given range is outside of any zone
>>> 	2) the enclosing zone if such a zone doesn't interleave with
>>> 	   any other zone
>>> 	3) the default zone if more zones interleave for this range
>>> where the default zone is the movable zone only if movable_node is
>>> enabled; otherwise it is a kernel zone.
>>>
>>> Here is an example of the semantic (movable_node is not present, but
>>> it works in an analogous way). We start with the following memblocks, all
>>> of them offline
>>> memory34/valid_zones:Normal Movable
>>> memory35/valid_zones:Normal Movable
>>> memory36/valid_zones:Normal Movable
>>> memory37/valid_zones:Normal Movable
>>> memory38/valid_zones:Normal Movable
>>> memory39/valid_zones:Normal Movable
>>> memory40/valid_zones:Normal Movable
>>> memory41/valid_zones:Normal Movable
>>>
>>> Now, we online block 34 in default mode and block 37 as movable
>>> root@test1:/sys/devices/system/node/node1# echo online > memory34/state
>>> root@test1:/sys/devices/system/node/node1# echo online_movable > memory37/state
>>> memory34/valid_zones:Normal
>>> memory35/valid_zones:Normal Movable
>>> memory36/valid_zones:Normal Movable
>>> memory37/valid_zones:Movable
>>> memory38/valid_zones:Normal Movable
>>> memory39/valid_zones:Normal Movable
>>> memory40/valid_zones:Normal Movable
>>> memory41/valid_zones:Normal Movable
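
The listings above have the shape of the per-block valid_zones files. A
small sketch for collecting them, assuming the memoryN/valid_zones layout
shown in the example. The helper name read_valid_zones and its
base-directory argument are made up for illustration:

```python
import os
import re
from typing import Dict, List


def read_valid_zones(base: str) -> Dict[int, List[str]]:
    """Map memory block number -> zone names listed in its valid_zones file."""
    zones: Dict[int, List[str]] = {}
    for name in os.listdir(base):
        match = re.fullmatch(r"memory(\d+)", name)
        if match is None:
            continue  # not a memory block entry
        try:
            with open(os.path.join(base, name, "valid_zones")) as f:
                zones[int(match.group(1))] = f.read().split()
        except OSError:
            continue  # entry has no readable valid_zones file
    return dict(sorted(zones.items()))


if __name__ == "__main__":
    # Directory taken from the shell prompt in the example; it may not
    # exist on the machine running this sketch.
    node_dir = "/sys/devices/system/node/node1"
    if os.path.isdir(node_dir):
        for block, names in read_valid_zones(node_dir).items():
            print(f"memory{block}/valid_zones:{' '.join(names)}")
```

Running it before and after each echo shows which blocks had their set of
valid zones narrowed by the onlining.
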
>>
>> Hm so previously, blocks 37-41 would only allow Movable at this point, right?
> 
> yes
> 
>> Shouldn't we still default to Movable for them? We might be breaking some
>> existing userspace here.
> 
> I do not think so. Prior to f1dd2cd13c4b ("mm, memory_hotplug: do not
> associate hotadded memory to zones until online") from this merge window,
> we allowed only the last offline memory block, or one adjacent to an
> existing movable block, to be onlined movable. So the above wasn't possible.

Not exactly the above, but let's say blocks 1-34 are onlined as Normal and
35-37 as Movable. Then the only possible action before would be to online
38 as Movable? Now it defaults to Normal?
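
To make the question concrete, here is a minimal sketch of the three
default-onlining rules quoted above, applied to that layout. It is an
illustration only, not the kernel code: the function name
default_online_zone, the spans expressed as block numbers, and the reading
of "doesn't interleave" as "exactly one zone spans the block" are all
assumptions, and only a single kernel zone (Normal) is modelled.

```python
from typing import Dict, Optional, Tuple

# Hypothetical model: a zone's span as inclusive (first_block, last_block),
# or None if the zone spans nothing on this node.
Span = Optional[Tuple[int, int]]


def default_online_zone(block: int, spans: Dict[str, Span],
                        movable_node: bool = False) -> str:
    """Zone picked by a plain "echo online" under the proposed rules."""
    default = "Movable" if movable_node else "Normal"
    enclosing = [zone for zone, span in spans.items()
                 if span is not None and span[0] <= block <= span[1]]
    # Rule 2: exactly one zone spans the block, so keep that zone.
    if len(enclosing) == 1:
        return enclosing[0]
    # Rule 1 (no zone spans the block) and rule 3 (several zones do).
    return default


# Blocks 1-34 onlined as Normal, 35-37 as Movable; block 38 is past both spans.
layout = {"Normal": (1, 34), "Movable": (35, 37)}
print(default_online_zone(38, layout))
print(default_online_zone(38, layout, movable_node=True))
```

Under these assumptions block 38 falls under rule 1 and goes to Normal
unless movable_node is set.
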

> I
> doubt we have grown a new user since the rework has been merged but if
> you think we should make sure nothing like that happens then we should
> probably merge this patch in this release cycle.

If I'm right and this is a change compared to pre-rework, then it
doesn't matter.

>> IMHO onlining new memory past existing blocks is a more common use case
>> than onlining memory between two blocks that are already online?
> 
> I am not really sure. It is quite common to online and offline within an
> existing zone for memory ballooning. I do not know what kind of
> online operation they use, but the default online operation has
> historically preserved the zone, so I would be really reluctant to change
> that.

Hmm all right, ballooning...

>> I also agree with Wei Yang that it's rather fuzzy that a zone that has been
>> completely offlined will affect the defaults for the next onlining just because
>> it has some spanned range, which is however empty of actual populated memory.
> 
> I am sorry but I still do not see why. The zone is not empty. It has a
> range spanned. It just doesn't have any pages online. I really fail to
> see how that is different from zones with large offline holes.
> 
>> Maybe it would be simplest for everyone to just default to Normal, except
>> movable_node? That's if we decide that the potential breakage I
>> described above is a non-issue.
> 
> This would break the usecase where the memory is onlined as a certain type
> initially and then offlined/onlined later on demand for ballooning.
> 
> I wish this could be more clear, but the default onlining has been fuzzy
> since the movable online was introduced and it has been hard to build
> something really clear since then. The proposed semantic is the
> cleanest I could come up with, but I am open to any suggestions that
> wouldn't break existing usage.

OK, I can live with the semantics, if we clear up the question of breaking
existing users.


