From: Claudio Imbrenda <imbrenda@linux.ibm.com>
To: David Hildenbrand <david@redhat.com>
Cc: kvm@vger.kernel.org, cohuck@redhat.com, borntraeger@de.ibm.com,
	frankja@linux.ibm.com, thuth@redhat.com, pasic@linux.ibm.com,
	linux-s390@vger.kernel.org, linux-kernel@vger.kernel.org,
	Ulrich.Weigand@de.ibm.com,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Michal Hocko <mhocko@kernel.org>
Subject: Re: [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy
Date: Fri, 6 Aug 2021 11:30:05 +0200	[thread overview]
Message-ID: <20210806113005.0259d53c@p-imbrenda> (raw)
In-Reply-To: <86b114ef-41ea-04b6-327c-4a036f784fad@redhat.com>

On Fri, 6 Aug 2021 09:10:28 +0200
David Hildenbrand <david@redhat.com> wrote:

> On 04.08.21 17:40, Claudio Imbrenda wrote:
> > Previously, when a protected VM was rebooted or when it was shut
> > down, its memory was made unprotected, and then the protected VM
> > itself was destroyed. Looping over the whole address space can take
> > some time, considering the overhead of the various Ultravisor Calls
> > (UVCs). This means that a reboot or a shutdown could take a long
> > time, depending on the amount of memory used.
> > 
> > This patch series implements a deferred destroy mechanism for
> > protected guests. When a protected guest is destroyed, its memory
> > is cleared in background, allowing the guest to restart or
> > terminate significantly faster than before.
> > 
> > There are 2 possibilities when a protected VM is torn down:
> > * it still has an address space associated (reboot case)
> > * it does not have an address space anymore (shutdown case)
> > 
> > For the reboot case, the reference count of the mm is increased, and
> > then a background thread is started to clean up. Once the thread
> > went through the whole address space, the protected VM is actually
> > destroyed.  
> 
> That doesn't sound too hacky to me; it actually sounds like a good
> idea, doing what the guest would do either way but speeding it up
> asynchronously, but ...
> 
> > 
> > For the shutdown case, a list of pages to be destroyed is formed
> > when the mm is torn down. Instead of just unmapping the pages when
> > the address space is being torn down, they are also set aside.
> > Later when KVM cleans up the VM, a thread is started to clean up
> > the pages from the list.  
> 
> ... this ...
> 
> > 
> > This means that the same address space can have memory belonging to
> > more than one protected guest, although only one will be running;
> > the others will in fact not even have any CPUs.
> 
> ... this ...

this ^ is exactly the reboot case.
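
to make the two cases concrete, here is a rough sketch of both paths
(simplified, with hypothetical helper names, not the actual patch
code; needs <linux/mm.h>, <linux/kthread.h>, <linux/list.h>):

	/* reboot case: keep the mm alive, clean it up in the background */
	static int pv_destroy_mm_worker(void *data)
	{
		struct mm_struct *mm = data;

		/* one destroy-page UVC per page of the address space */
		pv_destroy_range(mm, 0, mm->task_size); /* hypothetical */
		mmput(mm);	/* drop the reference taken at teardown */
		return 0;
	}

	/* at teardown: take a reference so the page tables survive */
	mmget(kvm->mm);
	kthread_run(pv_destroy_mm_worker, kvm->mm, "pv_destroy");

	/* shutdown case: the mm is gone, the pages were set aside on a
	 * list while the address space was being torn down */
	static int pv_destroy_list_worker(void *data)
	{
		struct list_head *leftovers = data;
		struct page *page, *tmp;

		list_for_each_entry_safe(page, tmp, leftovers, lru) {
			list_del(&page->lru);
			uv_destroy_page(page_to_phys(page)); /* hypothetical */
			__free_page(page);
		}
		return 0;
	}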

> > When a guest is destroyed, its memory still counts towards its
> > memory control group until it's actually freed (I tested this
> > experimentally).
> > 
> > When the system runs out of memory, if a guest has terminated and
> > its memory is being cleaned asynchronously, the OOM killer will
> > wait a little and then see if memory has been freed. This has the
> > practical effect of slowing down memory allocations when the system
> > is out of memory to give the cleanup thread time to cleanup and
> > free memory, and avoid an actual OOM situation.  
> 
> ... and this sound like the kind of arch MM hacks that will bite us
> in the long run. Of course, I might be wrong, but already doing
> excessive GFP_ATOMIC allocations or messing with the OOM killer that

they are GFP_ATOMIC, but they should not put much pressure on memory,
and they can also fail without consequences; I used:

GFP_ATOMIC | __GFP_NOMEMALLOC | __GFP_NOWARN

also note that after every page allocation a page gets freed, so this
is only temporary.
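
for illustration, the allocation pattern is roughly this (the struct
name is made up):

	struct pv_page *entry;

	/* opportunistic: may fail under pressure, never dips into the
	 * emergency reserves, and stays silent when it fails */
	entry = kmalloc(sizeof(*entry),
			GFP_ATOMIC | __GFP_NOMEMALLOC | __GFP_NOWARN);
	if (!entry)
		return -ENOMEM; /* fall back to the synchronous path */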

I would not call it "messing with the OOM killer"; I'm using the same
interface used by virtio-balloon.
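
the shape of it, using the same register_oom_notifier() interface that
virtio-balloon uses (simplified sketch, hypothetical helper):

	#include <linux/oom.h>
	#include <linux/notifier.h>

	static int pv_oom_notify(struct notifier_block *self,
				 unsigned long dummy, void *parm)
	{
		unsigned long *freed = parm;

		/* let the cleanup thread free some pages before the
		 * OOM killer picks a victim; the count reported back
		 * tells the caller how much was reclaimed */
		*freed += pv_wait_for_cleanup(); /* hypothetical */
		return NOTIFY_OK;
	}

	static struct notifier_block pv_oom_nb = {
		.notifier_call = pv_oom_notify,
	};

	/* at init */
	register_oom_notifier(&pv_oom_nb);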

> way for a pure (shutdown) optimization is an alarm signal. Of course,
> I might be wrong.
> 
> You should at least CC linux-mm. I'll do that right now and also CC 
> Michal. He might have time to have a quick glimpse at patch #11 and
> #13.
> 
> https://lkml.kernel.org/r/20210804154046.88552-12-imbrenda@linux.ibm.com
> https://lkml.kernel.org/r/20210804154046.88552-14-imbrenda@linux.ibm.com
> 
> IMHO, we should proceed with patch 1-10, as they solve a really 
> important problem ("slow reboots") in a nice way, whereby patch 11 
> handles a case that can be worked around comparatively easily by 
> management tools -- my 2 cents.

how would management tools work around the issue that a shutdown can
take a very long time?

also, without my patches, the shutdown case would use export instead of
destroy, making it even slower.


Thread overview: 28+ messages
2021-08-04 15:40 [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 01/14] KVM: s390: pv: add macros for UVC CC values Claudio Imbrenda
2021-08-06  7:26   ` David Hildenbrand
2021-08-06  9:34     ` Claudio Imbrenda
2021-08-06 15:15     ` Janosch Frank
2021-08-04 15:40 ` [PATCH v3 02/14] KVM: s390: pv: avoid stall notifications for some UVCs Claudio Imbrenda
2021-08-06  7:30   ` David Hildenbrand
2021-08-06  9:33     ` Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 03/14] KVM: s390: pv: leak the ASCE page when destroy fails Claudio Imbrenda
2021-08-06  7:31   ` David Hildenbrand
2021-08-06  9:32     ` Claudio Imbrenda
2021-08-06 11:39       ` David Hildenbrand
2021-08-04 15:40 ` [PATCH v3 04/14] KVM: s390: pv: properly handle page flags for protected guests Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 05/14] KVM: s390: pv: handle secure storage violations " Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 06/14] KVM: s390: pv: handle secure storage exceptions for normal guests Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 07/14] KVM: s390: pv: refactor s390_reset_acc Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 08/14] KVM: s390: pv: usage counter instead of flag Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 09/14] KVM: s390: pv: add export before import Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 10/14] KVM: s390: pv: lazy destroy for reboot Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 11/14] KVM: s390: pv: extend lazy destroy to handle shutdown Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 12/14] KVM: s390: pv: module parameter to fence lazy destroy Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 13/14] KVM: s390: pv: add OOM notifier for " Claudio Imbrenda
2021-08-04 15:40 ` [PATCH v3 14/14] KVM: s390: pv: avoid export before import if possible Claudio Imbrenda
2021-08-06  7:10 ` [PATCH v3 00/14] KVM: s390: pv: implement lazy destroy David Hildenbrand
2021-08-06  9:30   ` Claudio Imbrenda [this message]
2021-08-06 11:30     ` David Hildenbrand
2021-08-06 13:44       ` Claudio Imbrenda
2021-08-09  8:50         ` David Hildenbrand
