From: Christian Borntraeger <borntraeger@de.ibm.com>
To: "Michael S. Tsirkin" <mst@redhat.com>, "Denis V. Lunev" <den@openvz.org>
Cc: James.Bottomley@HansenPartnership.com, qemu-devel@nongnu.org,
	Raushaniya Maksudova <rmaksudova@virtuozzo.com>,
	Anthony Liguori <aliguori@amazon.com>
Subject: Re: [Qemu-devel] [PATCH 1/1] balloon: add a feature bit to let Guest OS deflate balloon on oom
Date: Fri, 12 Jun 2015 13:56:37 +0200
Message-ID: <557AC8F5.6040105@de.ibm.com>
In-Reply-To: <20150610151113-mutt-send-email-mst@redhat.com>

On 10.06.2015 at 15:13, Michael S. Tsirkin wrote:
> On Wed, Jun 10, 2015 at 03:02:21PM +0300, Denis V. Lunev wrote:
>> On 09/06/15 13:37, Christian Borntraeger wrote:
>>> On 09.06.2015 at 12:19, Denis V. Lunev wrote:
>>>> Excessive virtio_balloon inflation can cause invocation of the OOM-killer
>>>> when Linux is under severe memory pressure. Various mechanisms are
>>>> responsible for correct virtio_balloon memory management. Nevertheless it
>>>> is often the case that these control tools do not have enough time to
>>>> react to a fast-changing memory load. As a result the OS runs out of memory
>>>> and invokes the OOM-killer. The balancing of memory by use of the virtio
>>>> balloon should not cause the termination of processes while there are pages
>>>> in the balloon. Currently there is no way for the virtio balloon driver to
>>>> free memory at the last moment before some process gets killed by the
>>>> OOM-killer.
>>>>
>>>> This does not introduce a security hole, as the balloon itself runs
>>>> inside the guest OS and works in cooperation with the host. Thus
>>>> some improvements from the guest side should be considered normal.
>>>>
>>>> To solve the problem, introduce a virtio_balloon callback which is
>>>> expected to be called from the oom notifier call chain in out_of_memory()
>>>> function. If virtio balloon could release some memory, it will make the
>>>> system return and retry the allocation that forced the out of memory
>>>> killer to run.
>>>>
>>>> This behavior should be enabled if and only if the appropriate feature
>>>> bit is set on the device. It is off by default.
>>> The balloon frees pages this way:
>>>
>>> static void balloon_page(void *addr, int deflate)
>>> {
>>> #if defined(__linux__)
>>>     if (!kvm_enabled() || kvm_has_sync_mmu())
>>>         qemu_madvise(addr, TARGET_PAGE_SIZE,
>>>                 deflate ? QEMU_MADV_WILLNEED : QEMU_MADV_DONTNEED);
>>> #endif
>>> }
>>>
>>> The guest can re-touch that page and get an empty zero page or the old page
>>> back without compromising host integrity. This should work for all cases I
>>> am aware of (without sync_mmu it's a nop anyway), so why not enable that by
>>> default? Anything that I missed?
>>>
>>> Christian
>>
>> I'd like to do that :) Actually, the original version of the kernel patch
>> enabled this unconditionally. But Michael asked to make
>> it configurable and off by default.
>>
>> Den
> 
> That's not the question here.  The question is why it is limited by kvm_has_sync_mmu.

Well we have two interesting options here:

VIRTIO_BALLOON_F_MUST_TELL_HOST and VIRTIO_BALLOON_F_DEFLATE_ON_OOM

For any sane host with on-demand paging, just re-accessing the page
should simply work. So the common case could be
VIRTIO_BALLOON_F_MUST_TELL_HOST == off
VIRTIO_BALLOON_F_DEFLATE_ON_OOM == on
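
To illustrate why re-accessing the page "should simply work" on a host with on-demand paging, here is a minimal userspace sketch. It assumes Linux and a private anonymous mapping, where a page dropped with MADV_DONTNEED is refilled with zeros on the next access. The function name retouch_after_dontneed() is made up for this example and is not part of QEMU.

```c
#define _DEFAULT_SOURCE
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/*
 * Hypothetical helper: map one page, dirty it, drop it with MADV_DONTNEED,
 * then touch it again. Returns 1 if the page reads back as zeros and is
 * writable again, 0 if not, and -1 if the setup itself failed.
 */
int retouch_after_dontneed(void)
{
    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0)
        return -1;
    size_t len = (size_t)page;

    unsigned char *p = mmap(NULL, len, PROT_READ | PROT_WRITE,
                            MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (p == MAP_FAILED)
        return -1;

    memset(p, 0xAB, len);                 /* page now has real contents */

    if (madvise(p, len, MADV_DONTNEED) != 0) {
        munmap(p, len);
        return -1;
    }

    int ok = 1;
    for (size_t i = 0; i < len; i++) {    /* re-touch: faulted in on demand */
        if (p[i] != 0) {
            ok = 0;
            break;
        }
    }

    p[0] = 0x42;                          /* and it is usable again */
    if (p[0] != 0x42)
        ok = 0;

    munmap(p, len);
    return ok;
}
```

This only models the host side of balloon_page() from userspace; whether a given hypervisor setup behaves the same way is exactly the question MUST_TELL_HOST is meant to answer.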

Only for the rare case of hypervisors without paging or with other
memory-related restrictions do we have to enable MUST_TELL_HOST.
Now: QEMU knows exactly which case we have, so why not let QEMU tell
the guest what the capabilities are? (E.g. sync_mmu ---> no need to
tell the host.)
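
A rough sketch of what "let QEMU tell the guest" could look like. The bit numbers are the ones from the Linux virtio_balloon UAPI header; the function balloon_offered_features() and its two parameters are hypothetical and only illustrate the idea, they are not existing QEMU code.

```c
#include <stdbool.h>
#include <stdint.h>

/* Feature bit numbers as in the Linux virtio_balloon UAPI header. */
#define VIRTIO_BALLOON_F_MUST_TELL_HOST  0
#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM  2

/*
 * Hypothetical helper: pick the balloon feature bits to offer.
 * host_can_refault: the host can transparently fault a ballooned page
 *                   back in (e.g. sync_mmu is available).
 * deflate_on_oom:   admin-controlled knob for the OOM behavior.
 */
uint32_t balloon_offered_features(bool host_can_refault, bool deflate_on_oom)
{
    uint32_t features = 0;

    /* Only hosts that cannot refault need to be told before page reuse. */
    if (!host_can_refault)
        features |= 1u << VIRTIO_BALLOON_F_MUST_TELL_HOST;

    if (deflate_on_oom)
        features |= 1u << VIRTIO_BALLOON_F_DEFLATE_ON_OOM;

    return features;
}
```

With this, the common case above is balloon_offered_features(true, true), which offers DEFLATE_ON_OOM only.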

I can at least imagine that some admin wants to make the OOM case
configurable, but a sane default seems to be to not kill random
guest processes.
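
For the guest side, a toy model of the decision being discussed: on an OOM event, give pages back from the balloon if the feature was negotiated, otherwise let the OOM killer proceed. The struct toy_balloon and toy_balloon_oom_notify() are invented for illustration and do not mirror the actual driver code.

```c
#include <stddef.h>
#include <stdint.h>

#define VIRTIO_BALLOON_F_DEFLATE_ON_OOM  2

/* Toy stand-in for the guest balloon state (not the real driver struct). */
struct toy_balloon {
    uint32_t features;   /* negotiated feature bits */
    size_t   pages;      /* pages currently held in the balloon */
};

/*
 * Hypothetical OOM hook: returns the number of pages released.
 * A return value of 0 means nothing was freed, so the OOM killer
 * would go ahead and pick a victim.
 */
size_t toy_balloon_oom_notify(struct toy_balloon *b, size_t wanted)
{
    if (!(b->features & (1u << VIRTIO_BALLOON_F_DEFLATE_ON_OOM)))
        return 0;

    size_t freed = wanted < b->pages ? wanted : b->pages;
    b->pages -= freed;
    return freed;
}
```

If anything is freed, the allocation that triggered the OOM path can be retried instead of killing a process.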

Christian
