From: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
To: David Hildenbrand <david@redhat.com>
Cc: qemu-devel@nongnu.org, kvm@vger.kernel.org,
	qemu-s390x@nongnu.org, Richard Henderson <rth@twiddle.net>,
	Paolo Bonzini <pbonzini@redhat.com>,
	Eduardo Habkost <ehabkost@redhat.com>,
	"Michael S . Tsirkin" <mst@redhat.com>
Subject: Re: [PATCH v1 01/17] exec: Introduce ram_block_discard_set_(unreliable|required)()
Date: Fri, 15 May 2020 10:54:13 +0100
Message-ID: <20200515095413.GB2954@work-vm>
In-Reply-To: <20200506094948.76388-2-david@redhat.com>

* David Hildenbrand (david@redhat.com) wrote:
> We want to replace qemu_balloon_inhibit() with something more generic.
> In particular, we want to make sure that technologies that really rely
> on RAM block discards working reliably run mutually exclusive with
> technologies that break them.
> 
> E.g., vfio will usually pin all guest memory, making the virtio-balloon
> basically useless and making the VM consume more memory than reported via
> the balloon. While the balloon is special already (=> no guarantees, same
> behavior possible after reboots and with huge pages), this will be
> different, especially with virtio-mem.
> 
> Let's implement a way such that we can make both types of technology run
> mutually exclusive. We'll convert existing balloon inhibitors in successive
> patches and add some new ones. Add the check to
> qemu_balloon_is_inhibited() for now. We might want to make
> virtio-balloon an actual inhibitor in the future - however, that
> requires more thought to not break existing setups.
> 
> Cc: "Michael S. Tsirkin" <mst@redhat.com>
> Cc: Richard Henderson <rth@twiddle.net>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  balloon.c             |  3 ++-
>  exec.c                | 48 +++++++++++++++++++++++++++++++++++++++++++
>  include/exec/memory.h | 41 ++++++++++++++++++++++++++++++++++++
>  3 files changed, 91 insertions(+), 1 deletion(-)
> 
> diff --git a/balloon.c b/balloon.c
> index f104b42961..c49f57c27b 100644
> --- a/balloon.c
> +++ b/balloon.c
> @@ -40,7 +40,8 @@ static int balloon_inhibit_count;
>  
>  bool qemu_balloon_is_inhibited(void)
>  {
> -    return atomic_read(&balloon_inhibit_count) > 0;
> +    return atomic_read(&balloon_inhibit_count) > 0 ||
> +           ram_block_discard_is_broken();
>  }
>  
>  void qemu_balloon_inhibit(bool state)
> diff --git a/exec.c b/exec.c
> index 2874bb5088..52a6e40e99 100644
> --- a/exec.c
> +++ b/exec.c
> @@ -4049,4 +4049,52 @@ void mtree_print_dispatch(AddressSpaceDispatch *d, MemoryRegion *root)
>      }
>  }
>  
> +static int ram_block_discard_broken;

This could do with a comment; if I'm reading this right then
  +ve means broken
  -ve means required
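
Something along these lines, perhaps. This is only a standalone sketch of
the sign convention as I read it: it uses C11 <stdatomic.h> rather than
QEMU's atomic_read(), and discard_state, discard_is_broken() and
discard_is_required() are illustrative names, not the ones in the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Signed use counter (assumed reading of the patch):
 *   > 0: that many users have marked RAM block discards as broken
 *   < 0: that many users require RAM block discards to work
 *   = 0: no user of either kind is active
 */
static atomic_int discard_state;

/* True while at least one "broken" user is registered. */
static bool discard_is_broken(void)
{
    return atomic_load(&discard_state) > 0;
}

/* True while at least one "required" user is registered. */
static bool discard_is_required(void)
{
    return atomic_load(&discard_state) < 0;
}
```

With a single counter the two states cannot be active at the same time,
which is the mutual exclusion the commit message asks for.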

> +int ram_block_discard_set_broken(bool state)
> +{
> +    int old;
> +
> +    if (!state) {
> +        atomic_dec(&ram_block_discard_broken);
> +        return 0;
> +    }
> +
> +    do {
> +        old = atomic_read(&ram_block_discard_broken);
> +        if (old < 0) {
               /* Currently required */
> +            return -EBUSY;
> +        }
> +    } while (atomic_cmpxchg(&ram_block_discard_broken, old, old + 1) != old);
> +    return 0;
> +}
> +
> +int ram_block_discard_set_required(bool state)
> +{
> +    int old;
> +
> +    if (!state) {
> +        atomic_inc(&ram_block_discard_broken);
> +        return 0;
> +    }
> +
> +    do {
> +        old = atomic_read(&ram_block_discard_broken);
> +        if (old > 0) {
               /* Currently broken */
> +            return -EBUSY;
> +        }
> +    } while (atomic_cmpxchg(&ram_block_discard_broken, old, old - 1) != old);
> +    return 0;
> +}
> +
> +bool ram_block_discard_is_broken(void)
> +{
> +    return atomic_read(&ram_block_discard_broken) > 0;
> +}
> +
> +bool ram_block_discard_is_required(void)
> +{
> +    return atomic_read(&ram_block_discard_broken) < 0;
> +}
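
For reference, the cmpxchg retry loop in the two setters above can be
modelled outside QEMU roughly as follows. This is a sketch under
assumptions: C11 atomic_compare_exchange_weak() stands in for
atomic_cmpxchg(), and discard_take(), sketch_set_broken() and
sketch_set_required() are hypothetical names. The point is that a user of
one kind gets -EBUSY while any user of the other kind holds the counter on
the opposite side of zero.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stdbool.h>

/* > 0: discards broken (user count), < 0: discards required, 0: neither */
static atomic_int discard_state;

/* Move the counter one step: +1 for a "broken" user, -1 for "required". */
static int discard_take(int step)
{
    int old = atomic_load(&discard_state);

    do {
        /* Opposite sign means the other kind of user is active. */
        if ((step > 0 && old < 0) || (step < 0 && old > 0)) {
            return -EBUSY;
        }
        /* On failure, old is reloaded with the current value; retry. */
    } while (!atomic_compare_exchange_weak(&discard_state, &old, old + step));
    return 0;
}

static int sketch_set_broken(bool state)
{
    if (!state) {
        /* Undo one earlier successful sketch_set_broken(true). */
        atomic_fetch_sub(&discard_state, 1);
        return 0;
    }
    return discard_take(+1);
}

static int sketch_set_required(bool state)
{
    if (!state) {
        /* Undo one earlier successful sketch_set_required(true). */
        atomic_fetch_add(&discard_state, 1);
        return 0;
    }
    return discard_take(-1);
}
```

With this, sketch_set_broken(true) followed by sketch_set_required(true)
fails with -EBUSY until the first caller undoes its request with
sketch_set_broken(false).
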
> +
>  #endif
> diff --git a/include/exec/memory.h b/include/exec/memory.h
> index e000bd2f97..9bb5ced38d 100644
> --- a/include/exec/memory.h
> +++ b/include/exec/memory.h
> @@ -2463,6 +2463,47 @@ static inline MemOp devend_memop(enum device_endian end)
>  }
>  #endif
>  
> +/*
> + * Inhibit technologies that rely on discarding of parts of RAM blocks to work
> + * reliably, e.g., to manage the actual amount of memory consumed by the VM
> + * (then, the memory provided by RAM blocks might be bigger than the desired
> + * memory consumption). This *must* be set if:

'technologies that rely on discarding of parts of RAM blocks to work
reliably' is pretty long; I'm not sure of a better way of saying it
though.

Other than the comments;


Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

> + * - Discarding parts of a RAM block does not result in the change being
> + *   reflected in the VM and the pages getting freed.
> + * - All memory in RAM blocks is pinned or duplicated, invalidating any previous
> + *   discards blindly.
> + * - Discarding parts of a RAM block will result in integrity issues (e.g.,
> + *   encrypted VMs).
> + * Technologies that only temporarily pin the current working set of a
> + * driver are fine, because we don't expect such pages to be discarded
> + * (esp. based on guest action like balloon inflation).
> + *
> + * This is *not* to be used to protect from concurrent discards (esp.,
> + * postcopy).
> + *
> + * Returns 0 if successful. Returns -EBUSY if a technology that relies on
> + * discards to work reliably is active.
> + */
> +int ram_block_discard_set_broken(bool state);
> +
> +/*
> + * Inhibit technologies that will break discarding of pages in RAM blocks.
> + *
> + * Returns 0 if successful. Returns -EBUSY if discards are already set to
> + * broken.
> + */
> +int ram_block_discard_set_required(bool state);
> +
> +/*
> + * Test if discarding of memory in ram blocks is broken.
> + */
> +bool ram_block_discard_is_broken(void);
> +
> +/*
> + * Test if discarding of memory in ram blocks is required to work reliably.
> + */
> +bool ram_block_discard_is_required(void);
> +
>  #endif
>  
>  #endif
> -- 
> 2.25.3
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



