From mboxrd@z Thu Jan  1 00:00:00 1970
MIME-Version: 1.0
References: <CAJuQAmpDUyve2S+oxp9tLUhuRcnddXnNztC5PmYOOCpY6c68xg@mail.gmail.com>
 <91270a68-ff48-88b0-219c-69801f0c252f@redhat.com> <CAJuQAmoaK0Swytu2Os_SQRfG5_LqiCPaDa9yatatm9MtfncNTQ@mail.gmail.com>
 <75d4594f-0864-5172-a0f8-f97affedb366@redhat.com> <286AC319A985734F985F78AFA26841F73E3F8A02@shsmsx102.ccr.corp.intel.com>
 <CAJuQAmqcayaNuG19fKCuux=YVO3+VcN-qrXvobgKMykogeMkzA@mail.gmail.com>
 <20200203080520-mutt-send-email-mst@kernel.org> <5ac131de8e3b7fc1fafd05a61feb5f6889aeb917.camel@linux.intel.com>
 <c836a8d1-c5cc-eb8b-84ed-027070b77bf8@redhat.com> <20200203120225-mutt-send-email-mst@kernel.org>
 <CAJuQAmqGA9mhzR5AQeMDtovJAh7y8khC3qUtLKx_e9RdL0wFJQ@mail.gmail.com>
 <74cc25a6-cefb-c580-8e59-5b76fb680bf4@redhat.com> <CAJuQAmpiVqnNt-vSkQh5Gg63QZ49_nuz4+VW2Jfwn51gWVdtfA@mail.gmail.com>
 <b809340d-7e86-caf6-bf12-db7bb8265045@redhat.com> <CAJuQAmqeKvc_k7pmDuC1b+w6yezzHoSxZJ8WW5sHVo1yMsRPfg@mail.gmail.com>
In-Reply-To: <CAJuQAmqeKvc_k7pmDuC1b+w6yezzHoSxZJ8WW5sHVo1yMsRPfg@mail.gmail.com>
From: Tyler Sanderson <tysand@google.com>
Date: Tue, 4 Feb 2020 16:15:53 -0800
Message-ID: <CAJuQAmpzP3V8p002UYCGyTGkMQ=B1B_=o-4y=jxv2LPkbADdAw@mail.gmail.com>
Subject: Re: Balloon pressuring page cache
To: David Hildenbrand <david@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>, Alexander Duyck <alexander.h.duyck@linux.intel.com>, 
	"Wang, Wei W" <wei.w.wang@intel.com>, 
	"virtualization@lists.linux-foundation.org" <virtualization@lists.linux-foundation.org>, 
	David Rientjes <rientjes@google.com>, "linux-mm@kvack.org" <linux-mm@kvack.org>, 
	Michal Hocko <mhocko@kernel.org>, namit@vmware.com
Content-Type: multipart/alternative; boundary="000000000000f193dd059dc90f7b"
X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4
Sender: owner-linux-mm@kvack.org
Precedence: bulk
X-Loop: owner-majordomo@kvack.org
List-ID: <linux-mm.kvack.org>

--000000000000f193dd059dc90f7b
Content-Type: text/plain; charset="UTF-8"

On Tue, Feb 4, 2020 at 3:58 PM Tyler Sanderson <tysand@google.com> wrote:

>
>
> On Tue, Feb 4, 2020 at 11:17 AM David Hildenbrand <david@redhat.com>
> wrote:
>
>> On 04.02.20 19:52, Tyler Sanderson wrote:
>> >
>> >
>> > On Tue, Feb 4, 2020 at 12:29 AM David Hildenbrand <david@redhat.com>
>> > wrote:
>> >
>> >     On 03.02.20 21:32, Tyler Sanderson wrote:
>> >     > There were apparently good reasons for moving away from the
>> >     > OOM notifier callback:
>> >     > https://lkml.org/lkml/2018/7/12/314
>> >     > https://lkml.org/lkml/2018/8/2/322
>> >     >
>> >     > In particular the OOM notifier is worse than the shrinker because:
>> >
>> >     The issue is that DEFLATE_ON_OOM is under-specified.
>> >
>> >     >
>> >     >  1. It is last-resort, which means the system has already gone
>> >     >     through heroics to prevent OOM. Those heroic reclaim efforts
>> >     >     are expensive and impact application performance.
>> >
>> >     That's *exactly* what "deflate on OOM" suggests.
>> >
>> >
>> > It seems there are some use cases where "deflate on OOM" is desired and
>> > others where "deflate on pressure" is desired.
>> > This suggests adding a new feature bit "DEFLATE_ON_PRESSURE" that
>> > registers the shrinker, and reverting DEFLATE_ON_OOM to use the OOM
>> > notifier callback.
>> >
>> > This lets users configure the balloon for their use case.
>>
>> You want the old behavior back, so why should we introduce a new one? Or
>> am I missing something? (you did want us to revert to old handling, no?)
>>
> Reverting actually doesn't help me, because this has been the behavior
> since Linux 4.19, which is already widely in use. So my device
> implementation needs to handle the shrinker behavior anyway. I started
> this conversation to ask what the intended device implementation was.
>
I should clarify: reverting _would_ improve guest performance under my
implementation, so I guess I'm in favor. But I think we should also
consider other reasonable device implementations, which suggests adding
a new feature bit so that each device implementation can choose.
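
To make the proposal a bit more concrete, here is a rough sketch of how
the two bits could select the guest-side hook. This is a toy model in
Python, not driver code. DEFLATE_ON_OOM is bit 2 in the virtio spec;
DEFLATE_ON_PRESSURE does not exist, so its bit number below is a
placeholder I made up, and the helper names negotiate() and
deflate_hooks() are hypothetical as well.

```python
from enum import IntFlag
from typing import List


class BalloonFeature(IntFlag):
    # VIRTIO_BALLOON_F_DEFLATE_ON_OOM is feature bit 2 in the virtio spec.
    DEFLATE_ON_OOM = 1 << 2
    # Hypothetical: the proposed bit does not exist; bit 6 is a placeholder.
    DEFLATE_ON_PRESSURE = 1 << 6


def negotiate(device_offers: BalloonFeature,
              driver_supports: BalloonFeature) -> BalloonFeature:
    """Virtio-style negotiation: keep only the bits both sides understand."""
    return device_offers & driver_supports


def deflate_hooks(negotiated: BalloonFeature) -> List[str]:
    """Return the guest-side hooks the driver would register in this model."""
    hooks = []
    if negotiated & BalloonFeature.DEFLATE_ON_PRESSURE:
        # Early deflation: the balloon is reclaimed under memory pressure.
        hooks.append("shrinker")
    if negotiated & BalloonFeature.DEFLATE_ON_OOM:
        # Last resort: the balloon is reclaimed only when OOM is imminent.
        hooks.append("oom_notifier")
    return hooks


device = BalloonFeature.DEFLATE_ON_OOM | BalloonFeature.DEFLATE_ON_PRESSURE
old_guest = BalloonFeature.DEFLATE_ON_OOM  # guest that predates the new bit
new_guest = BalloonFeature.DEFLATE_ON_OOM | BalloonFeature.DEFLATE_ON_PRESSURE

print(deflate_hooks(negotiate(device, old_guest)))
print(deflate_hooks(negotiate(device, new_guest)))
```

In this model a guest that does not know the new bit simply never
acknowledges it, so the device can tell which deflate behavior it will
get instead of inferring it from the guest kernel version.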


> I think there are reasonable device implementations that would prefer the
> shrinker behavior (it turns out that mine doesn't).
> For example, an implementation that slowly inflates the balloon for the
> purpose of memory overcommit. It might leave the balloon inflated and
> expect any memory pressure (including page cache usage) to deflate the
> balloon as a way to dynamically right-size the balloon.
>
> Two reasons I didn't go with the above implementation:
> 1. I need to support guests before Linux 4.19 which don't have the
> shrinker behavior.
> 2. Memory in the balloon does not appear as "available" in /proc/meminfo
> even though it is freeable. This is confusing to users, but isn't a deal
> breaker.
>
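
To illustrate reason 2, here is a small sketch of the arithmetic a user
would have to do by hand today. It assumes the balloon size is known
from somewhere else (for example, reported by the host), because
/proc/meminfo does not account for it as available. The sample numbers
and the helper names parse_meminfo() and effective_available_kb() are
made up for illustration.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of integer values (kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines that are not "Name: value" pairs
        parts = rest.split()
        if parts:
            fields[name.strip()] = int(parts[0])
    return fields


def effective_available_kb(meminfo_text, balloon_kb):
    """MemAvailable plus balloon memory that the guest could get back.

    Assumes every ballooned page is freeable under pressure, which only
    holds when the balloon deflates on pressure or on OOM.
    """
    info = parse_meminfo(meminfo_text)
    return info["MemAvailable"] + balloon_kb


# Made-up sample, not output from a real guest.
SAMPLE = """MemTotal:        8000000 kB
MemFree:          500000 kB
MemAvailable:    1500000 kB
"""

print(effective_available_kb(SAMPLE, balloon_kb=2000000))
```

On a real guest the text would come from reading /proc/meminfo; the
point is only that the number users see understates what the guest can
actually reclaim.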
> If we added a DEFLATE_ON_PRESSURE feature bit that indicated shrinker API
> support then that would resolve reason #1 (ideally we would backport the
> bit to 4.19).
>
> In any case, the shrinker behavior when pressuring page cache is more of
> an inefficiency than a bug. It's not clear to me that it necessitates
> reverting. If there were/are reasons to be on the shrinker interface, then
> I think those carry similar weight to the problem itself.
>
>
>>
>> To this very day I consider virtio-balloon a big hack, and I don't see
>> it getting better with new config knobs. That said, the
>> technologies that are candidates to replace it (free page reporting,
>> taming the guest page cache, etc.) are still not ready - so we'll have
>> to stick with it for now :( .
>>
>> >
>> > I'm actually not sure how you would safely do memory overcommit without
>> > DEFLATE_ON_OOM. So I think it unlocks a huge use case.
>>
>> Using better suited technologies that are not ready yet (well, some form
>> of free page reporting is available under IBM z already but in a
>> proprietary form) ;) Anyhow, as I remember it, DEFLATE_ON_OOM only makes
>> it less likely that your guest crashes; it does not make it safe to
>> squeeze the last bit out of your guest VM.
>>
> Can you elaborate on the danger of DEFLATE_ON_OOM? I haven't seen any
> problems in testing but I'd really like to know about the dangers.
> Is there a difference in safety between the OOM notifier callback and the
> shrinker API?
>
>
>>
>> --
>> Thanks,
>>
>> David / dhildenb
>>
>>

--000000000000f193dd059dc90f7b--

