From: David Hildenbrand <david@redhat.com>
To: David Rientjes <rientjes@google.com>
Cc: SeongJae Park <sj@kernel.org>,
	"T.J. Alumbaugh" <talumbau@google.com>,
	lsf-pc@lists.linux-foundation.org,
	"Sudarshan Rajagopalan (QUIC)" <quic_sudaraja@quicinc.com>,
	hch@lst.de, kai.huang@intel.com, jon@nutanix.com,
	Yuanchu Xie <yuanchu@google.com>, linux-mm <linux-mm@kvack.org>,
	damon@lists.linux.dev
Subject: Re: [LSF/MM/BPF TOPIC] VM Memory Overcommit
Date: Thu, 2 Mar 2023 10:32:18 +0100
Message-ID: <e660ae94-7b7b-19a4-3748-60432ab389df@redhat.com>
In-Reply-To: <c57f3f06-5079-6e28-5238-c5731ee02a6e@google.com>

On 02.03.23 04:26, David Rientjes wrote:
> On Tue, 28 Feb 2023, David Hildenbrand wrote:
> 
>> On 28.02.23 23:38, SeongJae Park wrote:
>>> On Tue, 28 Feb 2023 10:20:57 +0100 David Hildenbrand <david@redhat.com>
>>> wrote:
>>>
>>>> On 23.02.23 00:59, T.J. Alumbaugh wrote:
>>>>> Hi,
>>>>>
>>>>> This topic proposal would be to present and discuss multiple MM
>>>>> features to improve host memory overcommit while running VMs. There
>>>>> are two general cases:
>>>>>
>>>>> 1. The host and its guests operate independently,
>>>>>
>>>>> 2. The host and its guests cooperate by techniques like ballooning.
>>>>>
>>>>> In the first case, we would discuss some new techniques, e.g., fast
>>>>> access bit harvesting in the KVM MMU, and some difficulties, e.g.,
>>>>> double zswapping.
>>>>>
>>>>> In the second case, we would like to discuss a novel working set size
>>>>> (WSS) notifier framework and some improvements to the ballooning
>>>>> policy. The WSS notifier, when available, can report WSS to its
>>>>> listeners. VM Memory Overcommit is one of its use cases: the
>>>>> virtio-balloon driver can register for WSS notifications and relay WSS
>>>>> to the host. The host can leverage the WSS notifications and improve
>>>>> the ballooning policy.
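
To make the proposed flow a bit more concrete, here is a minimal sketch of
what such a notifier interface could boil down to. All names in it
(wss_report, wss_notifier, register_wss_notifier) are made up for
illustration and do not exist in the kernel today; a real implementation
would sit on the kernel's notifier infrastructure and relay the report over
a virtqueue rather than printf:

/*
 * Hypothetical sketch only, written as plain C so it compiles and runs
 * as-is. The WSS framework invokes a registered callback with the current
 * working set size estimate; a virtio-balloon-like listener relays it.
 */
#include <stdint.h>
#include <stdio.h>

/* A WSS report as the framework might hand it to listeners. */
struct wss_report {
	uint64_t wss_bytes;     /* estimated working set size          */
	uint64_t interval_ms;   /* observation window of the estimate  */
};

/* Listener registration: one callback per interested subsystem. */
struct wss_notifier {
	void (*notify)(const struct wss_report *report, void *priv);
	void *priv;
};

static struct wss_notifier *registered; /* single listener for the sketch */

static int register_wss_notifier(struct wss_notifier *n)
{
	registered = n;
	return 0;
}

/* What a balloon-like driver would do with the report. */
static void balloon_wss_notify(const struct wss_report *report, void *priv)
{
	/* A real driver would queue this on a virtqueue to the host. */
	printf("relaying WSS to host: %llu bytes over %llu ms\n",
	       (unsigned long long)report->wss_bytes,
	       (unsigned long long)report->interval_ms);
}

int main(void)
{
	static struct wss_notifier balloon_listener = {
		.notify = balloon_wss_notify,
	};
	struct wss_report sample = { .wss_bytes = 512ULL << 20,
				     .interval_ms = 2000 };

	register_wss_notifier(&balloon_listener);
	registered->notify(&sample, registered->priv); /* framework side */
	return 0;
}
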
>>>>>
>>>>> This topic would be of interest to a wide audience, e.g., for
>>>>> phones, laptops and servers.
>>>>> Co-presented with Yuanchu Xie.
>>>>
>>>> In general, having the WSS available to the hypervisor might be
>>>> beneficial. I recall that there was an idea to leverage MGLRU and to
>>>> communicate MGLRU statistics to the hypervisor, such that the hypervisor
>>>> can make decisions using these statistics.
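
As a rough pointer to what is already there today: with CONFIG_LRU_GEN
enabled, MGLRU exposes per-generation page counts via debugfs, which a
guest-side agent could in principle read and forward as a crude working-set
signal. A sketch only; the path and the availability of the file depend on
the kernel configuration:

/* Dump the MGLRU generation info (requires CONFIG_LRU_GEN, debugfs
 * mounted at /sys/kernel/debug, and root). Younger generations roughly
 * correspond to recently used memory. */
#include <stdio.h>

int main(void)
{
	FILE *f = fopen("/sys/kernel/debug/lru_gen", "r");
	char line[256];

	if (!f) {
		perror("lru_gen");
		return 1;
	}
	while (fgets(line, sizeof(line), f))
		fputs(line, stdout);
	fclose(f);
	return 0;
}
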
>>>>
>>>> But note that I don't think the future will be traditional memory
>>>> balloon inflation/deflation. I think it might be useful in related
>>>> contexts, though.
>>>>
>>>> What we actually might want is a way to tell the OS running inside the
>>>> VM to "please try not to use more than XXX MiB of physical memory", but
>>>> treat it as a soft limit. So in case we mess up, or there is a sudden
>>>> peak in memory consumption due to a workload, we won't harm the guest
>>>> OS/workload, and we don't have to act immediately to avoid trouble. One
>>>> can think of it as an evolution of memory ballooning: instead of creating
>>>> artificial memory pressure by inflating the balloon, which is fairly
>>>> event-driven and requires explicit deflation, we teach the OS to do it
>>>> natively and pair it with free page reporting.
>>>>
>>>> All free physical memory inside the VM can be reported using free page
>>>> reporting to the hypervisor, and the OS will try sticking to the
>>>> requested "logical" VM size, unless there is real demand for more memory.
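
To make that a bit more tangible, here is what the guest-side policy could
look like in the simplest case. The "soft_target_pages" knob below is made
up for illustration (it is not part of the virtio-balloon spec); the point
is that the target is advisory and only reclaimable memory is given back:

/*
 * Illustration only, compilable as plain C. The host publishes a soft
 * target; the guest trims back toward it using reclaimable memory and
 * free page reporting, but may exceed it under real demand.
 */
#include <stdint.h>
#include <stdio.h>

struct soft_limit_cfg {
	uint64_t soft_target_pages;  /* host's requested "logical" VM size */
};

/* How much should the guest try to give back right now? */
static uint64_t pages_to_trim(const struct soft_limit_cfg *cfg,
			      uint64_t used_pages, uint64_t reclaimable_pages)
{
	if (used_pages <= cfg->soft_target_pages)
		return 0;                    /* already within the target */

	uint64_t excess = used_pages - cfg->soft_target_pages;

	/* Soft limit: only trim what is actually reclaimable (free pages,
	 * clean caches); never squeeze the workload to meet the target. */
	return excess < reclaimable_pages ? excess : reclaimable_pages;
}

int main(void)
{
	/* 1 GiB target in 4 KiB pages; guest is 16 MiB over, 64 MiB reclaimable. */
	struct soft_limit_cfg cfg = { .soft_target_pages = 1 << 18 };

	printf("trim %llu pages\n",
	       (unsigned long long)pages_to_trim(&cfg, (1 << 18) + 4096, 16384));
	return 0;
}
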
>>>
>>> I think using DAMON_RECLAIM[1] inside the VM together with free page
>>> reporting could be an option.  Some users have tried that manually and
>>> reported positive results.  I'm trying to find a good way to give the
>>> hypervisor some control over the in-VM DAMON_RECLAIM usage.
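
For anyone who has not played with it: DAMON_RECLAIM is configured through
module parameters under /sys/module/damon_reclaim/parameters/, so one
option is simply a guest agent that rewrites those based on input from the
hypervisor. A minimal sketch; the exact parameter set depends on the kernel
version, and how the hypervisor conveys the desired values is left open
here:

/* Tune DAMON_RECLAIM from userspace inside the guest (run as root,
 * requires the damon_reclaim module). */
#include <stdio.h>

static int write_param(const char *name, const char *value)
{
	char path[256];
	FILE *f;

	snprintf(path, sizeof(path),
		 "/sys/module/damon_reclaim/parameters/%s", name);
	f = fopen(path, "w");
	if (!f) {
		perror(path);
		return -1;
	}
	fprintf(f, "%s\n", value);
	fclose(f);
	return 0;
}

int main(void)
{
	/* Treat regions idle for at least 30s as cold, then enable reclaim. */
	write_param("min_age", "30000000");   /* microseconds */
	write_param("enabled", "Y");
	return 0;
}
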
>>>
>>
>> I think we might want to go one step further and not only reclaim
>> (pro)actively, but also limit, e.g., the growth of caches such as the
>> pagecache, to make them aware of a soft limit as well. That said, I still
>> have to learn more about DAMON reclaim :)
>>
> 
> I'm curious: is it possible to impose this limit with memcg today, or are
> you specifically looking to provide a cap on page cache, dentries, inodes,
> etc., without specifically requiring memcg?

Good question. I remember that the last time this topic was raised, the 
common understanding was that the existing mechanisms (i.e., memcg) were 
not sufficient. But I am no expert on this, so this sure sounds like a good 
topic to discuss in a bigger group, with hopefully some memcg experts 
around :)
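
For reference, the closest existing knob is probably cgroup v2's
memory.high, which reclaims and throttles a group above the boundary
without hard-failing allocations; whether that is good enough for capping
page cache, dentries and inodes for a whole VM is exactly the open
question. A sketch, assuming cgroup v2 is mounted at /sys/fs/cgroup and a
group named "vm-workload" already exists (both are assumptions for the
example):

/* Set a 1 GiB soft cap on an existing cgroup v2 group. */
#include <stdio.h>

int main(void)
{
	const char *path = "/sys/fs/cgroup/vm-workload/memory.high";
	FILE *f = fopen(path, "w");

	if (!f) {
		perror(path);
		return 1;
	}
	/* Above memory.high the group is reclaimed and throttled, but
	 * allocations still succeed -- unlike memory.max. */
	fprintf(f, "%llu\n", 1ULL << 30);
	fclose(f);
	return 0;
}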

-- 
Thanks,

David / dhildenb


