linux-kernel.vger.kernel.org archive mirror
From: Wei Wang <wei.w.wang@intel.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	virtio-dev@lists.oasis-open.org,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	virtualization <virtualization@lists.linux-foundation.org>,
	KVM list <kvm@vger.kernel.org>, linux-mm <linux-mm@kvack.org>,
	"Michael S. Tsirkin" <mst@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Paolo Bonzini <pbonzini@redhat.com>,
	liliang.opensource@gmail.com, yang.zhang.wz@gmail.com,
	quan.xu0@gmail.com, nilal@redhat.com,
	Rik van Riel <riel@redhat.com>,
	peterx@redhat.com
Subject: Re: [PATCH v35 1/5] mm: support to get hints of free page blocks
Date: Fri, 13 Jul 2018 08:33:27 +0800
Message-ID: <5B47F357.7020202@intel.com>
In-Reply-To: <20180712114946.GI32648@dhcp22.suse.cz>

On 07/12/2018 07:49 PM, Michal Hocko wrote:
> On Thu 12-07-18 19:34:16, Wei Wang wrote:
>> On 07/12/2018 04:13 PM, Michal Hocko wrote:
>>> On Thu 12-07-18 10:52:08, Wei Wang wrote:
>>>> On 07/12/2018 10:30 AM, Linus Torvalds wrote:
>>>>> On Wed, Jul 11, 2018 at 7:17 PM Wei Wang <wei.w.wang@intel.com> wrote:
>>>>>> Would it be better to remove __GFP_THISNODE? We actually want to get all
>>>>>> the guest free pages (from all the nodes).
>>>>> Maybe. Or maybe it would be better to have the memory balloon logic be
>>>>> per-node? Maybe you don't want to remove too much memory from one
>>>>> node? I think it's one of those "play with it" things.
>>>>>
>>>>> I don't think that's the big issue, actually. I think the real issue
>>>>> is how to react quickly and gracefully to "oops, I'm trying to give
>>>>> memory away, but now the guest wants it back" while you're in the
>>>>> middle of trying to create that 2TB list of pages.
>>>> OK. virtio-balloon has already registered an oom notifier
>>>> (virtballoon_oom_notify). I plan to add some control there. If oom happens,
>>>> - stop the page allocation;
>>>> - immediately give back the allocated pages to mm.
>>> Please don't. Oom notifier is an absolutely hideous interface which
>>> should go away sooner or later (I would much rather like the former) so
>>> do not build new logic on top of it. I would appreciate it much more
>>> if you actually removed the notifier.
>>>
>>> You can give memory back from the standard shrinker interface. If we are
>>> reaching low reclaim priorities then we are struggling to reclaim memory
>>> and then you can start returning pages back.
>> OK. Just curious why oom notifier is thought to be hideous, and has it been
>> a consensus?
> Because it is a completely non-transparent callout from the OOM context
> which is really subtle on its own. It is just too easy to end up in
> weird corner cases. We really have to be careful and be as swift as
> possible. Any potential sleep would make the OOM situation much worse
> because nobody would be able to make forward progress, or an (in)direct
> dependency on the MM subsystem could easily deadlock. Those are really
> hard to track down, and defining the notifier as blockable by design
> just asks for bad implementations because most people simply do not
> realize how subtle the oom context is.
>
> Another thing is that it happens way too late when we have basically
> reclaimed the world and didn't get out of the memory pressure so you can
> expect any workload is suffering already. Anybody sitting on a large
> amount of reclaimable memory should have released that memory by that
> time. Proportionally to the reclaim pressure ideally.
>
> The notifier API is completely unaware of oom constraints. Just imagine
> you are OOM in a subset of numa nodes. The callback doesn't have any idea
> about that.
>
> Moreover we do have a proper reclaim mechanism that has a feedback
> loop and that should always be preferable to an abrupt reclaim.

Sounds very reasonable, thanks for the elaboration. I'll try the shrinker approach.

Best,
Wei
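
The shrinker idea discussed above (return held pages gradually, in
proportion to reclaim pressure, rather than all at once from an OOM
callout) can be illustrated with a small userspace model. This is only
a sketch under stated assumptions: struct balloon_model, model_count(),
model_scan() and model_reclaim_pass() are hypothetical names invented
for illustration and are not kernel or virtio-balloon APIs. The
"freeable >> priority" target is a simplification of how reclaim scales
shrinker work with priority, not the kernel's exact formula, and the
model ignores NUMA nodes entirely.

```c
#include <assert.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
#define DEF_PRIORITY 12 /* same value as the kernel's default reclaim priority */

struct balloon_model {
	unsigned long held_pages; /* free pages currently held by the balloon */
};

/* "count" step: report how many pages could be handed back. */
static unsigned long model_count(const struct balloon_model *b)
{
	return b->held_pages;
}

/* "scan" step: hand back at most nr_to_scan pages, return how many were freed. */
static unsigned long model_scan(struct balloon_model *b, unsigned long nr_to_scan)
{
	unsigned long freed = nr_to_scan < b->held_pages ? nr_to_scan : b->held_pages;

	b->held_pages -= freed;
	return freed;
}

/*
 * One reclaim pass at the given priority. A lower priority number means
 * reclaim is struggling more, so a larger share of the held pages is
 * requested. At priority 0 everything that is still held is returned.
 */
static unsigned long model_reclaim_pass(struct balloon_model *b, int priority)
{
	assert(priority >= 0 && priority <= DEF_PRIORITY);
	return model_scan(b, model_count(b) >> priority);
}
```

Calling model_reclaim_pass() in a loop from DEF_PRIORITY down to 0
returns a small fraction of the held pages first and everything by the
last pass, which is the feedback-driven behaviour the quoted text
prefers over an abrupt release at OOM time.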




Thread overview: 27+ messages
2018-07-10  9:31 [PATCH v35 0/5] Virtio-balloon: support free page reporting Wei Wang
2018-07-10  9:31 ` [PATCH v35 1/5] mm: support to get hints of free page blocks Wei Wang
2018-07-10 10:16   ` Wang, Wei W
2018-07-10 17:33   ` Linus Torvalds
2018-07-11  1:28     ` Wei Wang
2018-07-11  1:44       ` Linus Torvalds
2018-07-11  9:21         ` Michal Hocko
2018-07-11 10:52           ` Wei Wang
2018-07-11 11:09             ` Michal Hocko
2018-07-11 13:55               ` Wang, Wei W
2018-07-11 14:38                 ` Michal Hocko
2018-07-11 19:36               ` Michael S. Tsirkin
2018-07-11 16:23           ` Linus Torvalds
2018-07-12  2:21             ` Wei Wang
2018-07-12  2:30               ` Linus Torvalds
2018-07-12  2:52                 ` Wei Wang
2018-07-12  8:13                   ` Michal Hocko
2018-07-12 11:34                     ` Wei Wang
2018-07-12 11:49                       ` Michal Hocko
2018-07-13  0:33                         ` Wei Wang [this message]
2018-07-12 13:12             ` Michal Hocko
2018-07-11  4:00     ` Michael S. Tsirkin
2018-07-11  4:04       ` Michael S. Tsirkin
2018-07-10  9:31 ` [PATCH v35 2/5] virtio-balloon: remove BUG() in init_vqs Wei Wang
2018-07-10  9:31 ` [PATCH v35 3/5] virtio-balloon: VIRTIO_BALLOON_F_FREE_PAGE_HINT Wei Wang
2018-07-10  9:31 ` [PATCH v35 4/5] mm/page_poison: expose page_poisoning_enabled to kernel modules Wei Wang
2018-07-10  9:31 ` [PATCH v35 5/5] virtio-balloon: VIRTIO_BALLOON_F_PAGE_POISON Wei Wang
