From mboxrd@z Thu Jan 1 00:00:00 1970
MIME-Version: 1.0
References: <20200205163402.42627-1-david@redhat.com> <20200205163402.42627-4-david@redhat.com> <20200216044641-mutt-send-email-mst@kernel.org>
In-Reply-To: <20200216044641-mutt-send-email-mst@kernel.org>
From: Tyler Sanderson
Date: Thu, 20 Feb 2020 19:29:53 -0800
Subject: Re: [PATCH v1 3/3] virtio-balloon: Switch back to OOM handler for VIRTIO_BALLOON_F_DEFLATE_ON_OOM
To: "Michael S. Tsirkin"
Cc: David Hildenbrand, linux-kernel@vger.kernel.org, linux-mm@kvack.org, virtualization@lists.linux-foundation.org, Wei Wang, Alexander Duyck, David Rientjes, Nadav Amit, Michal Hocko
Content-Type: multipart/alternative; boundary="0000000000002d6743059f0da34b"

--0000000000002d6743059f0da34b
Content-Type: text/plain; charset="UTF-8"

Testing this patch is on my short-term TODO list, but I wasn't able to get
to it this week. It is prioritized.

In the meantime, I can anecdotally vouch that kernels before 4.19, the ones
using the OOM notifier callback, have roughly 10x faster balloon inflation
when pressuring the cache. So I anticipate this patch will return to that
state and help my use case.

I will try to post official measurements of this patch next week.

On Sun, Feb 16, 2020 at 1:47 AM Michael S. Tsirkin <mst@redhat.com> wrote:

> On Fri, Feb 14, 2020 at 10:51:43AM +0100, David Hildenbrand wrote:
> > On 05.02.20 17:34, David Hildenbrand wrote:
> > > Commit 71994620bb25 ("virtio_balloon: replace oom notifier with shrinker")
> > > changed the behavior when deflation happens automatically. Instead of
> > > deflating when called by the OOM handler, the shrinker is used.
> > >
> > > However, the balloon is not simply some slab cache that should be
> > > shrunk when under memory pressure. The shrinker does not have a concept of
> > > priorities, so this behavior cannot be configured.
> > >
> > > There was a report that this results in undesired side effects when
> > > inflating the balloon to shrink the page cache. [1]
> > >     "When inflating the balloon against page cache (i.e. no free memory
> > >      remains) vmscan.c will both shrink page cache, but also invoke the
> > >      shrinkers -- including the balloon's shrinker. So the balloon
> > >      driver allocates memory which requires reclaim, vmscan gets this
> > >      memory by shrinking the balloon, and then the driver adds the
> > >      memory back to the balloon. Basically a busy no-op."
> > >
> > > The name "deflate on OOM" makes it pretty clear when deflation should
> > > happen - after other approaches to reclaim memory failed, not while
> > > reclaiming. This allows to minimize the footprint of a guest - memory
> > > will only be taken out of the balloon when really needed.
> > >
> > > Especially, a drop_slab() will result in the whole balloon getting
> > > deflated - undesired. While handling it via the OOM handler might not be
> > > perfect, it keeps existing behavior. If we want a different behavior, then
> > > we need a new feature bit and document it properly (although, there should
> > > be a clear use case and the intended effects should be well described).
> > >
> > > Keep using the shrinker for VIRTIO_BALLOON_F_FREE_PAGE_HINT, because
> > > this has no such side effects. Always register the shrinker with
> > > VIRTIO_BALLOON_F_FREE_PAGE_HINT now. We are always allowed to reuse free
> > > pages that are still to be processed by the guest. The hypervisor takes
> > > care of identifying and resolving possible races between processing a
> > > hinting request and the guest reusing a page.
> > >
> > > In contrast to pre commit 71994620bb25 ("virtio_balloon: replace oom
> > > notifier with shrinker"), don't add a module parameter to configure the
> > > number of pages to deflate on OOM. Can be re-added if really needed.
> > > Also, pay attention that leak_balloon() returns the number of 4k pages -
> > > convert it properly in virtio_balloon_oom_notify().
> > >
> > > Note1: using the OOM handler is frowned upon, but it really is what we
> > >        need for this feature.
> > >
> > > Note2: without VIRTIO_BALLOON_F_MUST_TELL_HOST (iow, always with QEMU) we
> > >        could actually skip sending deflation requests to our hypervisor,
> > >        making the OOM path *very* simple. Basically freeing pages and
> > >        updating the balloon. If the communication with the host ever
> > >        becomes a problem on this call path.
> > >
> >
> > @Michael, how to proceed with this?
> >
>
> I'd like to see some reports that this helps people.
> e.g. a tested-by tag.
>
> > --
> > Thanks,
> >
> > David / dhildenb
>
--0000000000002d6743059f0da34b--