From: David Hildenbrand <david@redhat.com>
To: Linux Memory Management List <linux-mm@kvack.org>
Cc: Minchan Kim <minchan@kernel.org>,
	Matthew Wilcox <willy@infradead.org>,
	Rik van Riel <riel@surriel.com>, Michal Hocko <mhocko@kernel.org>,
	Andrea Arcangeli <aarcange@redhat.com>,
	Peter Xu <peterx@redhat.com>, Vlastimil Babka <vbabka@suse.cz>,
	Yang Shi <yang.shi@linux.alibaba.com>,
	Balbir Singh <bsingharora@gmail.com>
Subject: Re: Page zapping and page table reclaim
Date: Wed, 24 Mar 2021 10:55:36 +0100
Message-ID: <53e72516-2e38-f490-4d1f-709291140e2f@redhat.com>
In-Reply-To: <bae8b967-c206-819d-774c-f57b94c4b362@redhat.com>

On 11.03.21 19:14, David Hildenbrand wrote:
> Hi folks,
> 
> I was wondering, is there any mechanism that reclaims basically empty
> page tables in a running process?
> 
> Like: When I MADV_DONTNEED a huge range, there could be plenty of
> basically empty (e.g., all entries invalid) page tables we could
> reclaim. As soon as we zap a complete PMD we could reclaim (depending on
> the architecture) a whole page.
> 
> Zapping on the PMD level might have the most impact, I guess.
> 
> For 1 GB, we need 262144 4k pages. If we assume each PTE is 8 bytes, we
> need a total of 2 MB for the lowest level page tables (PTE tables).
> 
> OTOH, we would need only 512 PMD entries - a single 4k page. Zapping
> 1 TB would mean we could free up another 4 MB of PMD tables - rather a
> corner case, and we can live with that.
> 
> 
> Of course, the same might apply to other cases where we can restore all
> page table content from the VMA again. One example would be after
> MADV_FREE zapped a whole range of entries we marked.
> 
> Looks like if we happen to zap a THP, we should already get what we want
> (no page table underneath, so nothing to remove).
> 
> I haven't immediately stumbled over anything, but could be I am missing
> the obvious. I guess what would need some thought is concurrent
> discards/pagefaults - but it feels like being similar to
> collapsing/splitting a THP while there is other system activity.
> 
> Maybe there is already something and I am just not aware of it.
> 
> Thanks!
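
To restate the scenario above with a concrete sketch (purely
illustrative, not from the original discussion; sizes picked
arbitrarily): touching 1 GiB of anonymous memory allocates 262144 PTEs,
i.e. ~2 MiB of PTE tables, and MADV_DONTNEED over the whole range frees
the pages but leaves those now-empty page tables in place.

/*
 * Illustrative only: force 4k mappings, populate, then zap.  The
 * anonymous pages are freed by MADV_DONTNEED, but the (now empty)
 * PTE tables stay allocated -- compare PageTables: in /proc/meminfo
 * before and after.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define GiB (1024UL * 1024 * 1024)

int main(void)
{
	size_t len = 1 * GiB;	/* 262144 4k pages -> ~2 MiB of PTE tables */
	char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		       MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (p == MAP_FAILED) {
		perror("mmap");
		return 1;
	}

	madvise(p, len, MADV_NOHUGEPAGE);	/* keep it PTE-mapped */
	memset(p, 1, len);			/* populate: PTE tables get allocated */

	if (madvise(p, len, MADV_DONTNEED))	/* zap: pages freed, tables kept */
		perror("madvise");

	getchar();	/* keep the process alive for inspection */
	return 0;
}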

Thanks for the feedback so far. I just did a very simple experiment:

1. Start a VM (QEMU) with 60 GB and populate/preallocate all page tables.
2. Inflate the memory balloon (virtio-balloon) in the VM to 58 GB
3. Wait until fully inflated

Before inflating the balloon:  PageTables:       131760 kB
After inflating the balloon:   no real change
After shutting down the VM:    PageTables:         8064 kB

For comparison, starting a 2 GB VM and preallocating/populating all
memory: PageTables:        12660 kB
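
The PageTables values are the ones reported by /proc/meminfo; a minimal
sketch (added for illustration, not part of the original mail) to sample
that counter before/after inflating the balloon:

/*
 * Illustrative only: print the PageTables: line from /proc/meminfo,
 * e.g. once before and once after inflating the balloon.
 */
#include <stdio.h>
#include <string.h>

int main(void)
{
	char line[128];
	FILE *f = fopen("/proc/meminfo", "r");

	if (!f) {
		perror("/proc/meminfo");
		return 1;
	}
	while (fgets(line, sizeof(line), f)) {
		if (!strncmp(line, "PageTables:", 11)) {
			fputs(line, stdout);
			break;
		}
	}
	fclose(f);
	return 0;
}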


So in this case, there is quite some room for improvement (> 100 MiB).
virtio-balloon discards memory in 4k granularity, which means we never
get to zap whole THPs (the first discard already breaks up the THP) and
consequently never remove any page tables.
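
A rough userspace analogue of that effect (illustrative sketch with
assumptions: THP enabled, and the touched 2 MiB range actually backed
by a huge page): discarding a single 4k page out of a THP-backed region
splits the huge mapping, so a PTE table gets populated and then sticks
around:

#define _GNU_SOURCE
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>

#define PMD_SIZE (2UL * 1024 * 1024)

int main(void)
{
	/* Over-allocate so a PMD-aligned 2 MiB chunk can be carved out. */
	char *raw = mmap(NULL, 2 * PMD_SIZE, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	char *p;

	if (raw == MAP_FAILED) {
		perror("mmap");
		return 1;
	}
	p = (char *)(((uintptr_t)raw + PMD_SIZE - 1) & ~(PMD_SIZE - 1));

	madvise(p, PMD_SIZE, MADV_HUGEPAGE);	/* ask for a THP */
	memset(p, 1, PMD_SIZE);			/* fault it in */

	/*
	 * Discarding a single 4k page splits the huge PMD mapping: the
	 * other 511 pages are now mapped via a PTE table that stays
	 * allocated.
	 */
	if (madvise(p, 4096, MADV_DONTNEED))
		perror("madvise");

	getchar();	/* inspect AnonHugePages:/PageTables: while running */
	return 0;
}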

I'll try identifying other workloads/cases where such an optimization
is applicable and work on asynchronous page table reclaim. Thanks!

-- 
Thanks,

David / dhildenb



Thread overview: 13+ messages
2021-03-11 18:14 Page zapping and page table reclaim David Hildenbrand
2021-03-11 21:26 ` Peter Xu
2021-03-11 21:35   ` David Hildenbrand
2021-03-19 17:04     ` Yang Shi
2021-03-22  9:34       ` David Hildenbrand
2021-03-18 16:57 ` Vlastimil Babka
2021-03-18 23:53   ` Balbir Singh
2021-03-19 12:44     ` David Hildenbrand
2021-03-20  1:56       ` Balbir Singh
2021-03-22  9:19         ` David Hildenbrand
2021-03-18 18:03 ` Rik van Riel
2021-03-18 18:15   ` David Hildenbrand
2021-03-24  9:55 ` David Hildenbrand [this message]
