All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH v25 QEMU 0/3] virtio-balloon: add support for page poison and free page reporting
@ 2020-05-27  4:13 ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-05-27  4:13 UTC (permalink / raw)
  To: david, mst; +Cc: virtio-dev, qemu-devel

This series provides an asynchronous means of reporting free guest pages
to QEMU through virtio-balloon so that the memory associated with those
pages can be dropped and reused by other processes and/or guests on the
host. Using this it is possible to avoid unnecessary I/O to disk and
greatly improve performance in the case of memory overcommit on the host.

I originally submitted this patch series back on February 11th 2020[1],
but at that time I was focused primarily on the kernel portion of this
patch set. However as of April 7th those patches are now included in
Linus's kernel tree[2] and so I am submitting the QEMU pieces for
inclusion.

[1]: https://lore.kernel.org/lkml/20200211224416.29318.44077.stgit@localhost.localdomain/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b0c504f154718904ae49349147e3b7e6ae91ffdc

Changes from v17:
Fixed typo in patch 1 title
Addressed white-space issues reported via checkpatch
Added braces {} for two if statements to match expected coding style

Changes from v18:
Updated patches 2 and 3 based on input from dhildenb
Added comment to patch 2 describing what keeps us from reporting a bad page
Added patch to address issue with ROM devices being directly writable

Changes from v19:
Added std-headers change to match changes pushed for linux kernel headers
Added patch to remove "report" from page hinting code paths
Updated comment to better explain why we disable hints w/ page poisoning
Removed code that was modifying config size for poison vs hinting
Dropped x-page-poison property
Added code to bounds check the reported region vs the RAM block
Dropped patch for ROM devices as that was already pulled in by Paolo

Changes from v20:
Rearranged patches to push Linux header sync patches to front
Removed association between free page hinting and VIRTIO_BALLOON_F_PAGE_POISON
Added code to enable VIRTIO_BALLOON_F_PAGE_POISON if page reporting is enabled
Fixed possible resource leak if poison or qemu_balloon_is_inhibited return true

Changes from v21:
Added ack for patch 3
Rewrote patch description for page poison reporting feature
Made page-poison independent property and set to enabled by default
Added logic to migrate poison_val
Added several comments in code to better explain features
Switched free-page-reporting property to disabled by default

Changes from v22:
Added ack for patches 4 & 5
Added additional comment fixes in patch 3 to remove "reporting" reference
Renamed rvq in patch 5 to reporting_vq to better match linux kernel
Moved call adding reporting_vq to after free_page_vq

Changes from v23:
Rebased on latest QEMU
Dropped patches 1 & 2 as Linux kernel headers were synced
Added compat machine code for page-poison feature

Changes from v24:
Moved free page hinting rename to end of set as feature may be removed entirely
Added code to cleanup reporting_vq

---

Alexander Duyck (3):
      virtio-balloon: Implement support for page poison reporting feature
      virtio-balloon: Provide an interface for free page reporting
      virtio-balloon: Replace free page hinting references to 'report' with 'hint'


 hw/core/machine.c                  |    4 +
 hw/virtio/virtio-balloon.c         |  179 ++++++++++++++++++++++++++++--------
 include/hw/virtio/virtio-balloon.h |   23 ++---
 3 files changed, 155 insertions(+), 51 deletions(-)

--


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] [PATCH v25 QEMU 0/3] virtio-balloon: add support for page poison and free page reporting
@ 2020-05-27  4:13 ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-05-27  4:13 UTC (permalink / raw)
  To: david, mst; +Cc: virtio-dev, qemu-devel

This series provides an asynchronous means of reporting free guest pages
to QEMU through virtio-balloon so that the memory associated with those
pages can be dropped and reused by other processes and/or guests on the
host. Using this it is possible to avoid unnecessary I/O to disk and
greatly improve performance in the case of memory overcommit on the host.

I originally submitted this patch series back on February 11th 2020[1],
but at that time I was focused primarily on the kernel portion of this
patch set. However as of April 7th those patches are now included in
Linus's kernel tree[2] and so I am submitting the QEMU pieces for
inclusion.

[1]: https://lore.kernel.org/lkml/20200211224416.29318.44077.stgit@localhost.localdomain/
[2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b0c504f154718904ae49349147e3b7e6ae91ffdc

Changes from v17:
Fixed typo in patch 1 title
Addressed white-space issues reported via checkpatch
Added braces {} for two if statements to match expected coding style

Changes from v18:
Updated patches 2 and 3 based on input from dhildenb
Added comment to patch 2 describing what keeps us from reporting a bad page
Added patch to address issue with ROM devices being directly writable

Changes from v19:
Added std-headers change to match changes pushed for linux kernel headers
Added patch to remove "report" from page hinting code paths
Updated comment to better explain why we disable hints w/ page poisoning
Removed code that was modifying config size for poison vs hinting
Dropped x-page-poison property
Added code to bounds check the reported region vs the RAM block
Dropped patch for ROM devices as that was already pulled in by Paolo

Changes from v20:
Rearranged patches to push Linux header sync patches to front
Removed association between free page hinting and VIRTIO_BALLOON_F_PAGE_POISON
Added code to enable VIRTIO_BALLOON_F_PAGE_POISON if page reporting is enabled
Fixed possible resource leak if poison or qemu_balloon_is_inhibited return true

Changes from v21:
Added ack for patch 3
Rewrote patch description for page poison reporting feature
Made page-poison independent property and set to enabled by default
Added logic to migrate poison_val
Added several comments in code to better explain features
Switched free-page-reporting property to disabled by default

Changes from v22:
Added ack for patches 4 & 5
Added additional comment fixes in patch 3 to remove "reporting" reference
Renamed rvq in patch 5 to reporting_vq to better match linux kernel
Moved call adding reporting_vq to after free_page_vq

Changes from v23:
Rebased on latest QEMU
Dropped patches 1 & 2 as Linux kernel headers were synced
Added compat machine code for page-poison feature

Changes from v24:
Moved free page hinting rename to end of set as feature may be removed entirely
Added code to cleanup reporting_vq

---

Alexander Duyck (3):
      virtio-balloon: Implement support for page poison reporting feature
      virtio-balloon: Provide an interface for free page reporting
      virtio-balloon: Replace free page hinting references to 'report' with 'hint'


 hw/core/machine.c                  |    4 +
 hw/virtio/virtio-balloon.c         |  179 ++++++++++++++++++++++++++++--------
 include/hw/virtio/virtio-balloon.h |   23 ++---
 3 files changed, 155 insertions(+), 51 deletions(-)

--

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [PATCH v25 QEMU 1/3] virtio-balloon: Implement support for page poison reporting feature
  2020-05-27  4:13 ` [virtio-dev] " Alexander Duyck
@ 2020-05-27  4:14   ` Alexander Duyck
  -1 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-05-27  4:14 UTC (permalink / raw)
  To: david, mst; +Cc: virtio-dev, qemu-devel

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

We need to make certain to advertise support for page poison reporting if
we want to actually get data on if the guest will be poisoning pages.

Add a value for reporting the poison value being used if page poisoning is
enabled in the guest. With this we can determine if we will need to skip
free page reporting when it is enabled in the future.

The value currently has no impact on existing balloon interfaces. In the
case of existing balloon interfaces the onus is on the guest driver to
reapply whatever poison is in place.

When we add free page reporting the poison value is used to determine if
we can perform in-place page reporting. The expectation is that a reported
page will already contain the value specified by the poison, and the
reporting of the page should not change that value.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 hw/core/machine.c                  |    4 +++-
 hw/virtio/virtio-balloon.c         |   29 +++++++++++++++++++++++++++++
 include/hw/virtio/virtio-balloon.h |    1 +
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index bb3a7b18b193..9eca7d8c9bfe 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -28,7 +28,9 @@
 #include "hw/mem/nvdimm.h"
 #include "migration/vmstate.h"
 
-GlobalProperty hw_compat_5_0[] = {};
+GlobalProperty hw_compat_5_0[] = {
+    { "virtio-balloon-device", "page-poison", "false" },
+};
 const size_t hw_compat_5_0_len = G_N_ELEMENTS(hw_compat_5_0);
 
 GlobalProperty hw_compat_4_2[] = {
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 065cd450f10f..26f6a7ca2e35 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -634,6 +634,7 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 
     config.num_pages = cpu_to_le32(dev->num_pages);
     config.actual = cpu_to_le32(dev->actual);
+    config.poison_val = cpu_to_le32(dev->poison_val);
 
     if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
         config.free_page_report_cmd_id =
@@ -683,6 +684,14 @@ static ram_addr_t get_current_ram_size(void)
     return size;
 }
 
+static bool virtio_balloon_page_poison_support(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON);
+}
+
 static void virtio_balloon_set_config(VirtIODevice *vdev,
                                       const uint8_t *config_data)
 {
@@ -697,6 +706,10 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
         qapi_event_send_balloon_change(vm_ram_size -
                         ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
     }
+    dev->poison_val = 0;
+    if (virtio_balloon_page_poison_support(dev)) {
+        dev->poison_val = le32_to_cpu(config.poison_val);
+    }
     trace_virtio_balloon_set_config(dev->actual, oldactual);
 }
 
@@ -755,6 +768,17 @@ static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
     }
 };
 
+static const VMStateDescription vmstate_virtio_balloon_page_poison = {
+    .name = "vitio-balloon-device/page-poison",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = virtio_balloon_page_poison_support,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(poison_val, VirtIOBalloon),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_virtio_balloon_device = {
     .name = "virtio-balloon-device",
     .version_id = 1,
@@ -767,6 +791,7 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
     },
     .subsections = (const VMStateDescription * []) {
         &vmstate_virtio_balloon_free_page_report,
+        &vmstate_virtio_balloon_page_poison,
         NULL
     }
 };
@@ -854,6 +879,8 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
         g_free(s->stats_vq_elem);
         s->stats_vq_elem = NULL;
     }
+
+    s->poison_val = 0;
 }
 
 static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
@@ -916,6 +943,8 @@ static Property virtio_balloon_properties[] = {
                     VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
     DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
                     VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
+    DEFINE_PROP_BIT("page-poison", VirtIOBalloon, host_features,
+                    VIRTIO_BALLOON_F_PAGE_POISON, true),
     /* QEMU 4.0 accidentally changed the config size even when free-page-hint
      * is disabled, resulting in QEMU 3.1 migration incompatibility.  This
      * property retains this quirk for QEMU 4.1 machine types.
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index d1c968d2376e..7fe78e5c14d7 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -70,6 +70,7 @@ typedef struct VirtIOBalloon {
     uint32_t host_features;
 
     bool qemu_4_0_config_size;
+    uint32_t poison_val;
 } VirtIOBalloon;
 
 #endif



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [virtio-dev] [PATCH v25 QEMU 1/3] virtio-balloon: Implement support for page poison reporting feature
@ 2020-05-27  4:14   ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-05-27  4:14 UTC (permalink / raw)
  To: david, mst; +Cc: virtio-dev, qemu-devel

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

We need to make certain to advertise support for page poison reporting if
we want to actually get data on if the guest will be poisoning pages.

Add a value for reporting the poison value being used if page poisoning is
enabled in the guest. With this we can determine if we will need to skip
free page reporting when it is enabled in the future.

The value currently has no impact on existing balloon interfaces. In the
case of existing balloon interfaces the onus is on the guest driver to
reapply whatever poison is in place.

When we add free page reporting the poison value is used to determine if
we can perform in-place page reporting. The expectation is that a reported
page will already contain the value specified by the poison, and the
reporting of the page should not change that value.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 hw/core/machine.c                  |    4 +++-
 hw/virtio/virtio-balloon.c         |   29 +++++++++++++++++++++++++++++
 include/hw/virtio/virtio-balloon.h |    1 +
 3 files changed, 33 insertions(+), 1 deletion(-)

diff --git a/hw/core/machine.c b/hw/core/machine.c
index bb3a7b18b193..9eca7d8c9bfe 100644
--- a/hw/core/machine.c
+++ b/hw/core/machine.c
@@ -28,7 +28,9 @@
 #include "hw/mem/nvdimm.h"
 #include "migration/vmstate.h"
 
-GlobalProperty hw_compat_5_0[] = {};
+GlobalProperty hw_compat_5_0[] = {
+    { "virtio-balloon-device", "page-poison", "false" },
+};
 const size_t hw_compat_5_0_len = G_N_ELEMENTS(hw_compat_5_0);
 
 GlobalProperty hw_compat_4_2[] = {
diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 065cd450f10f..26f6a7ca2e35 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -634,6 +634,7 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
 
     config.num_pages = cpu_to_le32(dev->num_pages);
     config.actual = cpu_to_le32(dev->actual);
+    config.poison_val = cpu_to_le32(dev->poison_val);
 
     if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
         config.free_page_report_cmd_id =
@@ -683,6 +684,14 @@ static ram_addr_t get_current_ram_size(void)
     return size;
 }
 
+static bool virtio_balloon_page_poison_support(void *opaque)
+{
+    VirtIOBalloon *s = opaque;
+    VirtIODevice *vdev = VIRTIO_DEVICE(s);
+
+    return virtio_vdev_has_feature(vdev, VIRTIO_BALLOON_F_PAGE_POISON);
+}
+
 static void virtio_balloon_set_config(VirtIODevice *vdev,
                                       const uint8_t *config_data)
 {
@@ -697,6 +706,10 @@ static void virtio_balloon_set_config(VirtIODevice *vdev,
         qapi_event_send_balloon_change(vm_ram_size -
                         ((ram_addr_t) dev->actual << VIRTIO_BALLOON_PFN_SHIFT));
     }
+    dev->poison_val = 0;
+    if (virtio_balloon_page_poison_support(dev)) {
+        dev->poison_val = le32_to_cpu(config.poison_val);
+    }
     trace_virtio_balloon_set_config(dev->actual, oldactual);
 }
 
@@ -755,6 +768,17 @@ static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
     }
 };
 
+static const VMStateDescription vmstate_virtio_balloon_page_poison = {
+    .name = "vitio-balloon-device/page-poison",
+    .version_id = 1,
+    .minimum_version_id = 1,
+    .needed = virtio_balloon_page_poison_support,
+    .fields = (VMStateField[]) {
+        VMSTATE_UINT32(poison_val, VirtIOBalloon),
+        VMSTATE_END_OF_LIST()
+    }
+};
+
 static const VMStateDescription vmstate_virtio_balloon_device = {
     .name = "virtio-balloon-device",
     .version_id = 1,
@@ -767,6 +791,7 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
     },
     .subsections = (const VMStateDescription * []) {
         &vmstate_virtio_balloon_free_page_report,
+        &vmstate_virtio_balloon_page_poison,
         NULL
     }
 };
@@ -854,6 +879,8 @@ static void virtio_balloon_device_reset(VirtIODevice *vdev)
         g_free(s->stats_vq_elem);
         s->stats_vq_elem = NULL;
     }
+
+    s->poison_val = 0;
 }
 
 static void virtio_balloon_set_status(VirtIODevice *vdev, uint8_t status)
@@ -916,6 +943,8 @@ static Property virtio_balloon_properties[] = {
                     VIRTIO_BALLOON_F_DEFLATE_ON_OOM, false),
     DEFINE_PROP_BIT("free-page-hint", VirtIOBalloon, host_features,
                     VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
+    DEFINE_PROP_BIT("page-poison", VirtIOBalloon, host_features,
+                    VIRTIO_BALLOON_F_PAGE_POISON, true),
     /* QEMU 4.0 accidentally changed the config size even when free-page-hint
      * is disabled, resulting in QEMU 3.1 migration incompatibility.  This
      * property retains this quirk for QEMU 4.1 machine types.
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index d1c968d2376e..7fe78e5c14d7 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -70,6 +70,7 @@ typedef struct VirtIOBalloon {
     uint32_t host_features;
 
     bool qemu_4_0_config_size;
+    uint32_t poison_val;
 } VirtIOBalloon;
 
 #endif


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v25 QEMU 2/3] virtio-balloon: Provide an interface for free page reporting
  2020-05-27  4:13 ` [virtio-dev] " Alexander Duyck
@ 2020-05-27  4:14   ` Alexander Duyck
  -1 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-05-27  4:14 UTC (permalink / raw)
  To: david, mst; +Cc: virtio-dev, qemu-devel

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Add support for free page reporting. The idea is to function very similar
to how the balloon works in that we basically end up madvising the page as
not being used. However we don't really need to bother with any deflate
type logic since the page will be faulted back into the guest when it is
read or written to.

This provides a new way of letting the guest proactively report free
pages to the hypervisor, so the hypervisor can reuse them. In contrast to
inflate/deflate that is triggered via the hypervisor explicitly.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 hw/virtio/virtio-balloon.c         |   72 ++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-balloon.h |    2 +
 2 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 26f6a7ca2e35..3e2ac1104b5f 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -321,6 +321,67 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
     balloon_stats_change_timer(s, 0);
 }
 
+static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
+    VirtQueueElement *elem;
+
+    while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) {
+        unsigned int i;
+
+        /*
+         * When we discard the page it has the effect of removing the page
+         * from the hypervisor itself and causing it to be zeroed when it
+         * is returned to us. So we must not discard the page if it is
+         * accessible by another device or process, or if the guest is
+         * expecting it to retain a non-zero value.
+         */
+        if (qemu_balloon_is_inhibited() || dev->poison_val) {
+            goto skip_element;
+        }
+
+        for (i = 0; i < elem->in_num; i++) {
+            void *addr = elem->in_sg[i].iov_base;
+            size_t size = elem->in_sg[i].iov_len;
+            ram_addr_t ram_offset;
+            RAMBlock *rb;
+
+            /*
+             * There is no need to check the memory section to see if
+             * it is ram/readonly/romd like there is for handle_output
+             * below. If the region is not meant to be written to then
+             * address_space_map will have allocated a bounce buffer
+             * and it will be freed in address_space_unmap and trigger
+             * and unassigned_mem_write before failing to copy over the
+             * buffer. If more than one bad descriptor is provided it
+             * will return NULL after the first bounce buffer and fail
+             * to map any resources.
+             */
+            rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+            if (!rb) {
+                trace_virtio_balloon_bad_addr(elem->in_addr[i]);
+                continue;
+            }
+
+            /*
+             * For now we will simply ignore unaligned memory regions, or
+             * regions that overrun the end of the RAMBlock.
+             */
+            if (!QEMU_IS_ALIGNED(ram_offset | size, qemu_ram_pagesize(rb)) ||
+                (ram_offset + size) > qemu_ram_get_used_length(rb)) {
+                continue;
+            }
+
+            ram_block_discard_range(rb, ram_offset, size);
+        }
+
+skip_element:
+        virtqueue_push(vq, elem, 0);
+        virtio_notify(vdev, vq);
+        g_free(elem);
+    }
+}
+
 static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
@@ -841,6 +902,12 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
             virtio_error(vdev, "iothread is missing");
         }
     }
+
+    if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_REPORTING)) {
+        s->reporting_vq = virtio_add_queue(vdev, 32,
+                                           virtio_balloon_handle_report);
+    }
+
     reset_stats(s);
 }
 
@@ -863,6 +930,9 @@ static void virtio_balloon_device_unrealize(DeviceState *dev)
     if (s->free_page_vq) {
         virtio_delete_queue(s->free_page_vq);
     }
+    if (s->reporting_vq) {
+        virtio_delete_queue(s->reporting_vq);
+    }
     virtio_cleanup(vdev);
 }
 
@@ -945,6 +1015,8 @@ static Property virtio_balloon_properties[] = {
                     VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
     DEFINE_PROP_BIT("page-poison", VirtIOBalloon, host_features,
                     VIRTIO_BALLOON_F_PAGE_POISON, true),
+    DEFINE_PROP_BIT("free-page-reporting", VirtIOBalloon, host_features,
+                    VIRTIO_BALLOON_F_REPORTING, false),
     /* QEMU 4.0 accidentally changed the config size even when free-page-hint
      * is disabled, resulting in QEMU 3.1 migration incompatibility.  This
      * property retains this quirk for QEMU 4.1 machine types.
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 7fe78e5c14d7..d49fef00cef2 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -42,7 +42,7 @@ enum virtio_balloon_free_page_report_status {
 
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
-    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
+    VirtQueue *ivq, *dvq, *svq, *free_page_vq, *reporting_vq;
     uint32_t free_page_report_status;
     uint32_t num_pages;
     uint32_t actual;



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [virtio-dev] [PATCH v25 QEMU 2/3] virtio-balloon: Provide an interface for free page reporting
@ 2020-05-27  4:14   ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-05-27  4:14 UTC (permalink / raw)
  To: david, mst; +Cc: virtio-dev, qemu-devel

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

Add support for free page reporting. The idea is to function very similar
to how the balloon works in that we basically end up madvising the page as
not being used. However we don't really need to bother with any deflate
type logic since the page will be faulted back into the guest when it is
read or written to.

This provides a new way of letting the guest proactively report free
pages to the hypervisor, so the hypervisor can reuse them. In contrast to
inflate/deflate that is triggered via the hypervisor explicitly.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 hw/virtio/virtio-balloon.c         |   72 ++++++++++++++++++++++++++++++++++++
 include/hw/virtio/virtio-balloon.h |    2 +
 2 files changed, 73 insertions(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 26f6a7ca2e35..3e2ac1104b5f 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -321,6 +321,67 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
     balloon_stats_change_timer(s, 0);
 }
 
+static void virtio_balloon_handle_report(VirtIODevice *vdev, VirtQueue *vq)
+{
+    VirtIOBalloon *dev = VIRTIO_BALLOON(vdev);
+    VirtQueueElement *elem;
+
+    while ((elem = virtqueue_pop(vq, sizeof(VirtQueueElement)))) {
+        unsigned int i;
+
+        /*
+         * When we discard the page it has the effect of removing the page
+         * from the hypervisor itself and causing it to be zeroed when it
+         * is returned to us. So we must not discard the page if it is
+         * accessible by another device or process, or if the guest is
+         * expecting it to retain a non-zero value.
+         */
+        if (qemu_balloon_is_inhibited() || dev->poison_val) {
+            goto skip_element;
+        }
+
+        for (i = 0; i < elem->in_num; i++) {
+            void *addr = elem->in_sg[i].iov_base;
+            size_t size = elem->in_sg[i].iov_len;
+            ram_addr_t ram_offset;
+            RAMBlock *rb;
+
+            /*
+             * There is no need to check the memory section to see if
+             * it is ram/readonly/romd like there is for handle_output
+             * below. If the region is not meant to be written to then
+             * address_space_map will have allocated a bounce buffer
+             * and it will be freed in address_space_unmap and trigger
+             * and unassigned_mem_write before failing to copy over the
+             * buffer. If more than one bad descriptor is provided it
+             * will return NULL after the first bounce buffer and fail
+             * to map any resources.
+             */
+            rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+            if (!rb) {
+                trace_virtio_balloon_bad_addr(elem->in_addr[i]);
+                continue;
+            }
+
+            /*
+             * For now we will simply ignore unaligned memory regions, or
+             * regions that overrun the end of the RAMBlock.
+             */
+            if (!QEMU_IS_ALIGNED(ram_offset | size, qemu_ram_pagesize(rb)) ||
+                (ram_offset + size) > qemu_ram_get_used_length(rb)) {
+                continue;
+            }
+
+            ram_block_discard_range(rb, ram_offset, size);
+        }
+
+skip_element:
+        virtqueue_push(vq, elem, 0);
+        virtio_notify(vdev, vq);
+        g_free(elem);
+    }
+}
+
 static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
@@ -841,6 +902,12 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
             virtio_error(vdev, "iothread is missing");
         }
     }
+
+    if (virtio_has_feature(s->host_features, VIRTIO_BALLOON_F_REPORTING)) {
+        s->reporting_vq = virtio_add_queue(vdev, 32,
+                                           virtio_balloon_handle_report);
+    }
+
     reset_stats(s);
 }
 
@@ -863,6 +930,9 @@ static void virtio_balloon_device_unrealize(DeviceState *dev)
     if (s->free_page_vq) {
         virtio_delete_queue(s->free_page_vq);
     }
+    if (s->reporting_vq) {
+        virtio_delete_queue(s->reporting_vq);
+    }
     virtio_cleanup(vdev);
 }
 
@@ -945,6 +1015,8 @@ static Property virtio_balloon_properties[] = {
                     VIRTIO_BALLOON_F_FREE_PAGE_HINT, false),
     DEFINE_PROP_BIT("page-poison", VirtIOBalloon, host_features,
                     VIRTIO_BALLOON_F_PAGE_POISON, true),
+    DEFINE_PROP_BIT("free-page-reporting", VirtIOBalloon, host_features,
+                    VIRTIO_BALLOON_F_REPORTING, false),
     /* QEMU 4.0 accidentally changed the config size even when free-page-hint
      * is disabled, resulting in QEMU 3.1 migration incompatibility.  This
      * property retains this quirk for QEMU 4.1 machine types.
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 7fe78e5c14d7..d49fef00cef2 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -42,7 +42,7 @@ enum virtio_balloon_free_page_report_status {
 
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
-    VirtQueue *ivq, *dvq, *svq, *free_page_vq;
+    VirtQueue *ivq, *dvq, *svq, *free_page_vq, *reporting_vq;
     uint32_t free_page_report_status;
     uint32_t num_pages;
     uint32_t actual;


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-05-27  4:13 ` [virtio-dev] " Alexander Duyck
@ 2020-05-27  4:14   ` Alexander Duyck
  -1 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-05-27  4:14 UTC (permalink / raw)
  To: david, mst; +Cc: virtio-dev, qemu-devel

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

In an upcoming patch a feature named Free Page Reporting is about to be
added. In order to avoid any confusion we should drop the use of the word
'report' when referring to Free Page Hinting. So what this patch does is go
through and replace all instances of 'report' with 'hint" when we are
referring to free page hinting.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
 include/hw/virtio/virtio-balloon.h |   20 +++++----
 2 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 3e2ac1104b5f..dc15409b0bb6 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -527,21 +527,21 @@ static bool get_free_page_hints(VirtIOBalloon *dev)
             ret = false;
             goto out;
         }
-        if (id == dev->free_page_report_cmd_id) {
-            dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
+        if (id == dev->free_page_hint_cmd_id) {
+            dev->free_page_hint_status = FREE_PAGE_HINT_S_START;
         } else {
             /*
              * Stop the optimization only when it has started. This
              * avoids a stale stop sign for the previous command.
              */
-            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
-                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+            if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
+                dev->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
             }
         }
     }
 
     if (elem->in_num) {
-        if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+        if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
             qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
                                       elem->in_sg[0].iov_len);
         }
@@ -567,11 +567,11 @@ static void virtio_ballloon_get_free_page_hints(void *opaque)
         qemu_mutex_unlock(&dev->free_page_lock);
         virtio_notify(vdev, vq);
       /*
-       * Start to poll the vq once the reporting started. Otherwise, continue
+       * Start to poll the vq once the hinting started. Otherwise, continue
        * only when there are entries on the vq, which need to be given back.
        */
     } while (continue_to_get_hints ||
-             dev->free_page_report_status == FREE_PAGE_REPORT_S_START);
+             dev->free_page_hint_status == FREE_PAGE_HINT_S_START);
     virtio_queue_set_notification(vq, 1);
 }
 
@@ -592,14 +592,14 @@ static void virtio_balloon_free_page_start(VirtIOBalloon *s)
         return;
     }
 
-    if (s->free_page_report_cmd_id == UINT_MAX) {
-        s->free_page_report_cmd_id =
-                       VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+    if (s->free_page_hint_cmd_id == UINT_MAX) {
+        s->free_page_hint_cmd_id =
+                       VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
     } else {
-        s->free_page_report_cmd_id++;
+        s->free_page_hint_cmd_id++;
     }
 
-    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
+    s->free_page_hint_status = FREE_PAGE_HINT_S_REQUESTED;
     virtio_notify_config(vdev);
 }
 
@@ -607,18 +607,18 @@ static void virtio_balloon_free_page_stop(VirtIOBalloon *s)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-    if (s->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
+    if (s->free_page_hint_status != FREE_PAGE_HINT_S_STOP) {
         /*
          * The lock also guarantees us that the
          * virtio_ballloon_get_free_page_hints exits after the
-         * free_page_report_status is set to S_STOP.
+         * free_page_hint_status is set to S_STOP.
          */
         qemu_mutex_lock(&s->free_page_lock);
         /*
-         * The guest hasn't done the reporting, so host sends a notification
-         * to the guest to actively stop the reporting.
+         * The guest isn't done hinting, so send a notification
+         * to the guest to actively stop the hinting.
          */
-        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+        s->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
         qemu_mutex_unlock(&s->free_page_lock);
         virtio_notify_config(vdev);
     }
@@ -628,15 +628,15 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-    s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
+    s->free_page_hint_status = FREE_PAGE_HINT_S_DONE;
     virtio_notify_config(vdev);
 }
 
 static int
-virtio_balloon_free_page_report_notify(NotifierWithReturn *n, void *data)
+virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
 {
     VirtIOBalloon *dev = container_of(n, VirtIOBalloon,
-                                      free_page_report_notify);
+                                      free_page_hint_notify);
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     PrecopyNotifyData *pnd = data;
 
@@ -685,7 +685,7 @@ static size_t virtio_balloon_config_size(VirtIOBalloon *s)
     if (virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
         return offsetof(struct virtio_balloon_config, poison_val);
     }
-    return offsetof(struct virtio_balloon_config, free_page_report_cmd_id);
+    return offsetof(struct virtio_balloon_config, free_page_hint_cmd_id);
 }
 
 static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
@@ -697,14 +697,14 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
     config.actual = cpu_to_le32(dev->actual);
     config.poison_val = cpu_to_le32(dev->poison_val);
 
-    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
-        config.free_page_report_cmd_id =
-                       cpu_to_le32(dev->free_page_report_cmd_id);
-    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
-        config.free_page_report_cmd_id =
+    if (dev->free_page_hint_status == FREE_PAGE_HINT_S_REQUESTED) {
+        config.free_page_hint_cmd_id =
+                       cpu_to_le32(dev->free_page_hint_cmd_id);
+    } else if (dev->free_page_hint_status == FREE_PAGE_HINT_S_STOP) {
+        config.free_page_hint_cmd_id =
                        cpu_to_le32(VIRTIO_BALLOON_CMD_ID_STOP);
-    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_DONE) {
-        config.free_page_report_cmd_id =
+    } else if (dev->free_page_hint_status == FREE_PAGE_HINT_S_DONE) {
+        config.free_page_hint_cmd_id =
                        cpu_to_le32(VIRTIO_BALLOON_CMD_ID_DONE);
     }
 
@@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
     return 0;
 }
 
-static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
+static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
     .name = "virtio-balloon-device/free-page-report",
     .version_id = 1,
     .minimum_version_id = 1,
     .needed = virtio_balloon_free_page_support,
     .fields = (VMStateField[]) {
-        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
-        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
+        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
+        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -851,7 +851,7 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
         VMSTATE_END_OF_LIST()
     },
     .subsections = (const VMStateDescription * []) {
-        &vmstate_virtio_balloon_free_page_report,
+        &vmstate_virtio_balloon_free_page_hint,
         &vmstate_virtio_balloon_page_poison,
         NULL
     }
@@ -883,12 +883,12 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
                            VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
         s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
                                            virtio_balloon_handle_free_page_vq);
-        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
-        s->free_page_report_cmd_id =
-                           VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
-        s->free_page_report_notify.notify =
-                                       virtio_balloon_free_page_report_notify;
-        precopy_add_notifier(&s->free_page_report_notify);
+        s->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
+        s->free_page_hint_cmd_id =
+                           VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
+        s->free_page_hint_notify.notify =
+                                       virtio_balloon_free_page_hint_notify;
+        precopy_add_notifier(&s->free_page_hint_notify);
         if (s->iothread) {
             object_ref(OBJECT(s->iothread));
             s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread),
@@ -919,7 +919,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev)
     if (virtio_balloon_free_page_support(s)) {
         qemu_bh_delete(s->free_page_bh);
         virtio_balloon_free_page_stop(s);
-        precopy_remove_notifier(&s->free_page_report_notify);
+        precopy_remove_notifier(&s->free_page_hint_notify);
     }
     balloon_stats_destroy_timer(s);
     qemu_remove_balloon_handler(s);
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index d49fef00cef2..28fd2b396087 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -23,7 +23,7 @@
 #define VIRTIO_BALLOON(obj) \
         OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON)
 
-#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000
+#define VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN 0x80000000
 
 typedef struct virtio_balloon_stat VirtIOBalloonStat;
 
@@ -33,20 +33,20 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
-enum virtio_balloon_free_page_report_status {
-    FREE_PAGE_REPORT_S_STOP = 0,
-    FREE_PAGE_REPORT_S_REQUESTED = 1,
-    FREE_PAGE_REPORT_S_START = 2,
-    FREE_PAGE_REPORT_S_DONE = 3,
+enum virtio_balloon_free_page_hint_status {
+    FREE_PAGE_HINT_S_STOP = 0,
+    FREE_PAGE_HINT_S_REQUESTED = 1,
+    FREE_PAGE_HINT_S_START = 2,
+    FREE_PAGE_HINT_S_DONE = 3,
 };
 
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
     VirtQueue *ivq, *dvq, *svq, *free_page_vq, *reporting_vq;
-    uint32_t free_page_report_status;
+    uint32_t free_page_hint_status;
     uint32_t num_pages;
     uint32_t actual;
-    uint32_t free_page_report_cmd_id;
+    uint32_t free_page_hint_cmd_id;
     uint64_t stats[VIRTIO_BALLOON_S_NR];
     VirtQueueElement *stats_vq_elem;
     size_t stats_vq_offset;
@@ -55,7 +55,7 @@ typedef struct VirtIOBalloon {
     QEMUBH *free_page_bh;
     /*
      * Lock to synchronize threads to access the free page reporting related
-     * fields (e.g. free_page_report_status).
+     * fields (e.g. free_page_hint_status).
      */
     QemuMutex free_page_lock;
     QemuCond  free_page_cond;
@@ -64,7 +64,7 @@ typedef struct VirtIOBalloon {
      * stopped.
      */
     bool block_iothread;
-    NotifierWithReturn free_page_report_notify;
+    NotifierWithReturn free_page_hint_notify;
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;



^ permalink raw reply related	[flat|nested] 44+ messages in thread

* [virtio-dev] [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-05-27  4:14   ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-05-27  4:14 UTC (permalink / raw)
  To: david, mst; +Cc: virtio-dev, qemu-devel

From: Alexander Duyck <alexander.h.duyck@linux.intel.com>

In an upcoming patch a feature named Free Page Reporting is about to be
added. In order to avoid any confusion we should drop the use of the word
'report' when referring to Free Page Hinting. So what this patch does is go
through and replace all instances of 'report' with 'hint" when we are
referring to free page hinting.

Acked-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
---
 hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
 include/hw/virtio/virtio-balloon.h |   20 +++++----
 2 files changed, 49 insertions(+), 49 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 3e2ac1104b5f..dc15409b0bb6 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -527,21 +527,21 @@ static bool get_free_page_hints(VirtIOBalloon *dev)
             ret = false;
             goto out;
         }
-        if (id == dev->free_page_report_cmd_id) {
-            dev->free_page_report_status = FREE_PAGE_REPORT_S_START;
+        if (id == dev->free_page_hint_cmd_id) {
+            dev->free_page_hint_status = FREE_PAGE_HINT_S_START;
         } else {
             /*
              * Stop the optimization only when it has started. This
              * avoids a stale stop sign for the previous command.
              */
-            if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
-                dev->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+            if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
+                dev->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
             }
         }
     }
 
     if (elem->in_num) {
-        if (dev->free_page_report_status == FREE_PAGE_REPORT_S_START) {
+        if (dev->free_page_hint_status == FREE_PAGE_HINT_S_START) {
             qemu_guest_free_page_hint(elem->in_sg[0].iov_base,
                                       elem->in_sg[0].iov_len);
         }
@@ -567,11 +567,11 @@ static void virtio_ballloon_get_free_page_hints(void *opaque)
         qemu_mutex_unlock(&dev->free_page_lock);
         virtio_notify(vdev, vq);
       /*
-       * Start to poll the vq once the reporting started. Otherwise, continue
+       * Start to poll the vq once the hinting started. Otherwise, continue
        * only when there are entries on the vq, which need to be given back.
        */
     } while (continue_to_get_hints ||
-             dev->free_page_report_status == FREE_PAGE_REPORT_S_START);
+             dev->free_page_hint_status == FREE_PAGE_HINT_S_START);
     virtio_queue_set_notification(vq, 1);
 }
 
@@ -592,14 +592,14 @@ static void virtio_balloon_free_page_start(VirtIOBalloon *s)
         return;
     }
 
-    if (s->free_page_report_cmd_id == UINT_MAX) {
-        s->free_page_report_cmd_id =
-                       VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
+    if (s->free_page_hint_cmd_id == UINT_MAX) {
+        s->free_page_hint_cmd_id =
+                       VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
     } else {
-        s->free_page_report_cmd_id++;
+        s->free_page_hint_cmd_id++;
     }
 
-    s->free_page_report_status = FREE_PAGE_REPORT_S_REQUESTED;
+    s->free_page_hint_status = FREE_PAGE_HINT_S_REQUESTED;
     virtio_notify_config(vdev);
 }
 
@@ -607,18 +607,18 @@ static void virtio_balloon_free_page_stop(VirtIOBalloon *s)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-    if (s->free_page_report_status != FREE_PAGE_REPORT_S_STOP) {
+    if (s->free_page_hint_status != FREE_PAGE_HINT_S_STOP) {
         /*
          * The lock also guarantees us that the
          * virtio_ballloon_get_free_page_hints exits after the
-         * free_page_report_status is set to S_STOP.
+         * free_page_hint_status is set to S_STOP.
          */
         qemu_mutex_lock(&s->free_page_lock);
         /*
-         * The guest hasn't done the reporting, so host sends a notification
-         * to the guest to actively stop the reporting.
+         * The guest isn't done hinting, so send a notification
+         * to the guest to actively stop the hinting.
          */
-        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
+        s->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
         qemu_mutex_unlock(&s->free_page_lock);
         virtio_notify_config(vdev);
     }
@@ -628,15 +628,15 @@ static void virtio_balloon_free_page_done(VirtIOBalloon *s)
 {
     VirtIODevice *vdev = VIRTIO_DEVICE(s);
 
-    s->free_page_report_status = FREE_PAGE_REPORT_S_DONE;
+    s->free_page_hint_status = FREE_PAGE_HINT_S_DONE;
     virtio_notify_config(vdev);
 }
 
 static int
-virtio_balloon_free_page_report_notify(NotifierWithReturn *n, void *data)
+virtio_balloon_free_page_hint_notify(NotifierWithReturn *n, void *data)
 {
     VirtIOBalloon *dev = container_of(n, VirtIOBalloon,
-                                      free_page_report_notify);
+                                      free_page_hint_notify);
     VirtIODevice *vdev = VIRTIO_DEVICE(dev);
     PrecopyNotifyData *pnd = data;
 
@@ -685,7 +685,7 @@ static size_t virtio_balloon_config_size(VirtIOBalloon *s)
     if (virtio_has_feature(features, VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
         return offsetof(struct virtio_balloon_config, poison_val);
     }
-    return offsetof(struct virtio_balloon_config, free_page_report_cmd_id);
+    return offsetof(struct virtio_balloon_config, free_page_hint_cmd_id);
 }
 
 static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
@@ -697,14 +697,14 @@ static void virtio_balloon_get_config(VirtIODevice *vdev, uint8_t *config_data)
     config.actual = cpu_to_le32(dev->actual);
     config.poison_val = cpu_to_le32(dev->poison_val);
 
-    if (dev->free_page_report_status == FREE_PAGE_REPORT_S_REQUESTED) {
-        config.free_page_report_cmd_id =
-                       cpu_to_le32(dev->free_page_report_cmd_id);
-    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_STOP) {
-        config.free_page_report_cmd_id =
+    if (dev->free_page_hint_status == FREE_PAGE_HINT_S_REQUESTED) {
+        config.free_page_hint_cmd_id =
+                       cpu_to_le32(dev->free_page_hint_cmd_id);
+    } else if (dev->free_page_hint_status == FREE_PAGE_HINT_S_STOP) {
+        config.free_page_hint_cmd_id =
                        cpu_to_le32(VIRTIO_BALLOON_CMD_ID_STOP);
-    } else if (dev->free_page_report_status == FREE_PAGE_REPORT_S_DONE) {
-        config.free_page_report_cmd_id =
+    } else if (dev->free_page_hint_status == FREE_PAGE_HINT_S_DONE) {
+        config.free_page_hint_cmd_id =
                        cpu_to_le32(VIRTIO_BALLOON_CMD_ID_DONE);
     }
 
@@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
     return 0;
 }
 
-static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
+static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
     .name = "virtio-balloon-device/free-page-report",
     .version_id = 1,
     .minimum_version_id = 1,
     .needed = virtio_balloon_free_page_support,
     .fields = (VMStateField[]) {
-        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
-        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
+        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
+        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
         VMSTATE_END_OF_LIST()
     }
 };
@@ -851,7 +851,7 @@ static const VMStateDescription vmstate_virtio_balloon_device = {
         VMSTATE_END_OF_LIST()
     },
     .subsections = (const VMStateDescription * []) {
-        &vmstate_virtio_balloon_free_page_report,
+        &vmstate_virtio_balloon_free_page_hint,
         &vmstate_virtio_balloon_page_poison,
         NULL
     }
@@ -883,12 +883,12 @@ static void virtio_balloon_device_realize(DeviceState *dev, Error **errp)
                            VIRTIO_BALLOON_F_FREE_PAGE_HINT)) {
         s->free_page_vq = virtio_add_queue(vdev, VIRTQUEUE_MAX_SIZE,
                                            virtio_balloon_handle_free_page_vq);
-        s->free_page_report_status = FREE_PAGE_REPORT_S_STOP;
-        s->free_page_report_cmd_id =
-                           VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN;
-        s->free_page_report_notify.notify =
-                                       virtio_balloon_free_page_report_notify;
-        precopy_add_notifier(&s->free_page_report_notify);
+        s->free_page_hint_status = FREE_PAGE_HINT_S_STOP;
+        s->free_page_hint_cmd_id =
+                           VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN;
+        s->free_page_hint_notify.notify =
+                                       virtio_balloon_free_page_hint_notify;
+        precopy_add_notifier(&s->free_page_hint_notify);
         if (s->iothread) {
             object_ref(OBJECT(s->iothread));
             s->free_page_bh = aio_bh_new(iothread_get_aio_context(s->iothread),
@@ -919,7 +919,7 @@ static void virtio_balloon_device_unrealize(DeviceState *dev)
     if (virtio_balloon_free_page_support(s)) {
         qemu_bh_delete(s->free_page_bh);
         virtio_balloon_free_page_stop(s);
-        precopy_remove_notifier(&s->free_page_report_notify);
+        precopy_remove_notifier(&s->free_page_hint_notify);
     }
     balloon_stats_destroy_timer(s);
     qemu_remove_balloon_handler(s);
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index d49fef00cef2..28fd2b396087 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -23,7 +23,7 @@
 #define VIRTIO_BALLOON(obj) \
         OBJECT_CHECK(VirtIOBalloon, (obj), TYPE_VIRTIO_BALLOON)
 
-#define VIRTIO_BALLOON_FREE_PAGE_REPORT_CMD_ID_MIN 0x80000000
+#define VIRTIO_BALLOON_FREE_PAGE_HINT_CMD_ID_MIN 0x80000000
 
 typedef struct virtio_balloon_stat VirtIOBalloonStat;
 
@@ -33,20 +33,20 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
-enum virtio_balloon_free_page_report_status {
-    FREE_PAGE_REPORT_S_STOP = 0,
-    FREE_PAGE_REPORT_S_REQUESTED = 1,
-    FREE_PAGE_REPORT_S_START = 2,
-    FREE_PAGE_REPORT_S_DONE = 3,
+enum virtio_balloon_free_page_hint_status {
+    FREE_PAGE_HINT_S_STOP = 0,
+    FREE_PAGE_HINT_S_REQUESTED = 1,
+    FREE_PAGE_HINT_S_START = 2,
+    FREE_PAGE_HINT_S_DONE = 3,
 };
 
 typedef struct VirtIOBalloon {
     VirtIODevice parent_obj;
     VirtQueue *ivq, *dvq, *svq, *free_page_vq, *reporting_vq;
-    uint32_t free_page_report_status;
+    uint32_t free_page_hint_status;
     uint32_t num_pages;
     uint32_t actual;
-    uint32_t free_page_report_cmd_id;
+    uint32_t free_page_hint_cmd_id;
     uint64_t stats[VIRTIO_BALLOON_S_NR];
     VirtQueueElement *stats_vq_elem;
     size_t stats_vq_offset;
@@ -55,7 +55,7 @@ typedef struct VirtIOBalloon {
     QEMUBH *free_page_bh;
     /*
      * Lock to synchronize threads to access the free page reporting related
-     * fields (e.g. free_page_report_status).
+     * fields (e.g. free_page_hint_status).
      */
     QemuMutex free_page_lock;
     QemuCond  free_page_cond;
@@ -64,7 +64,7 @@ typedef struct VirtIOBalloon {
      * stopped.
      */
     bool block_iothread;
-    NotifierWithReturn free_page_report_notify;
+    NotifierWithReturn free_page_hint_notify;
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply related	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 0/3] virtio-balloon: add support for page poison and free page reporting
  2020-05-27  4:13 ` [virtio-dev] " Alexander Duyck
@ 2020-06-08 15:37   ` Alexander Duyck
  -1 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-06-08 15:37 UTC (permalink / raw)
  To: David Hildenbrand, Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel

It's been almost 2 weeks since I submitted this. Just thought I would
follow up and see if there is any ETA on when this might be applied,
or if I missed the need to fix something and resubmit.

Thanks.

- Alex


On Tue, May 26, 2020 at 9:13 PM Alexander Duyck
<alexander.duyck@gmail.com> wrote:
>
> This series provides an asynchronous means of reporting free guest pages
> to QEMU through virtio-balloon so that the memory associated with those
> pages can be dropped and reused by other processes and/or guests on the
> host. Using this it is possible to avoid unnecessary I/O to disk and
> greatly improve performance in the case of memory overcommit on the host.
>
> I originally submitted this patch series back on February 11th 2020[1],
> but at that time I was focused primarily on the kernel portion of this
> patch set. However as of April 7th those patches are now included in
> Linus's kernel tree[2] and so I am submitting the QEMU pieces for
> inclusion.
>
> [1]: https://lore.kernel.org/lkml/20200211224416.29318.44077.stgit@localhost.localdomain/
> [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b0c504f154718904ae49349147e3b7e6ae91ffdc
>
> Changes from v17:
> Fixed typo in patch 1 title
> Addressed white-space issues reported via checkpatch
> Added braces {} for two if statements to match expected coding style
>
> Changes from v18:
> Updated patches 2 and 3 based on input from dhildenb
> Added comment to patch 2 describing what keeps us from reporting a bad page
> Added patch to address issue with ROM devices being directly writable
>
> Changes from v19:
> Added std-headers change to match changes pushed for linux kernel headers
> Added patch to remove "report" from page hinting code paths
> Updated comment to better explain why we disable hints w/ page poisoning
> Removed code that was modifying config size for poison vs hinting
> Dropped x-page-poison property
> Added code to bounds check the reported region vs the RAM block
> Dropped patch for ROM devices as that was already pulled in by Paolo
>
> Changes from v20:
> Rearranged patches to push Linux header sync patches to front
> Removed association between free page hinting and VIRTIO_BALLOON_F_PAGE_POISON
> Added code to enable VIRTIO_BALLOON_F_PAGE_POISON if page reporting is enabled
> Fixed possible resource leak if poison or qemu_balloon_is_inhibited return true
>
> Changes from v21:
> Added ack for patch 3
> Rewrote patch description for page poison reporting feature
> Made page-poison independent property and set to enabled by default
> Added logic to migrate poison_val
> Added several comments in code to better explain features
> Switched free-page-reporting property to disabled by default
>
> Changes from v22:
> Added ack for patches 4 & 5
> Added additional comment fixes in patch 3 to remove "reporting" reference
> Renamed rvq in patch 5 to reporting_vq to better match linux kernel
> Moved call adding reporting_vq to after free_page_vq
>
> Changes from v23:
> Rebased on latest QEMU
> Dropped patches 1 & 2 as Linux kernel headers were synced
> Added compat machine code for page-poison feature
>
> Changes from v24:
> Moved free page hinting rename to end of set as feature may be removed entirely
> Added code to cleanup reporting_vq
>
> ---
>
> Alexander Duyck (3):
>       virtio-balloon: Implement support for page poison reporting feature
>       virtio-balloon: Provide an interface for free page reporting
>       virtio-balloon: Replace free page hinting references to 'report' with 'hint'
>
>
>  hw/core/machine.c                  |    4 +
>  hw/virtio/virtio-balloon.c         |  179 ++++++++++++++++++++++++++++--------
>  include/hw/virtio/virtio-balloon.h |   23 ++---
>  3 files changed, 155 insertions(+), 51 deletions(-)
>
> --


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 0/3] virtio-balloon: add support for page poison and free page reporting
@ 2020-06-08 15:37   ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-06-08 15:37 UTC (permalink / raw)
  To: David Hildenbrand, Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel

It's been almost 2 weeks since I submitted this. Just thought I would
follow up and see if there is any ETA on when this might be applied,
or if I missed the need to fix something and resubmit.

Thanks.

- Alex


On Tue, May 26, 2020 at 9:13 PM Alexander Duyck
<alexander.duyck@gmail.com> wrote:
>
> This series provides an asynchronous means of reporting free guest pages
> to QEMU through virtio-balloon so that the memory associated with those
> pages can be dropped and reused by other processes and/or guests on the
> host. Using this it is possible to avoid unnecessary I/O to disk and
> greatly improve performance in the case of memory overcommit on the host.
>
> I originally submitted this patch series back on February 11th 2020[1],
> but at that time I was focused primarily on the kernel portion of this
> patch set. However as of April 7th those patches are now included in
> Linus's kernel tree[2] and so I am submitting the QEMU pieces for
> inclusion.
>
> [1]: https://lore.kernel.org/lkml/20200211224416.29318.44077.stgit@localhost.localdomain/
> [2]: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=b0c504f154718904ae49349147e3b7e6ae91ffdc
>
> Changes from v17:
> Fixed typo in patch 1 title
> Addressed white-space issues reported via checkpatch
> Added braces {} for two if statements to match expected coding style
>
> Changes from v18:
> Updated patches 2 and 3 based on input from dhildenb
> Added comment to patch 2 describing what keeps us from reporting a bad page
> Added patch to address issue with ROM devices being directly writable
>
> Changes from v19:
> Added std-headers change to match changes pushed for linux kernel headers
> Added patch to remove "report" from page hinting code paths
> Updated comment to better explain why we disable hints w/ page poisoning
> Removed code that was modifying config size for poison vs hinting
> Dropped x-page-poison property
> Added code to bounds check the reported region vs the RAM block
> Dropped patch for ROM devices as that was already pulled in by Paolo
>
> Changes from v20:
> Rearranged patches to push Linux header sync patches to front
> Removed association between free page hinting and VIRTIO_BALLOON_F_PAGE_POISON
> Added code to enable VIRTIO_BALLOON_F_PAGE_POISON if page reporting is enabled
> Fixed possible resource leak if poison or qemu_balloon_is_inhibited return true
>
> Changes from v21:
> Added ack for patch 3
> Rewrote patch description for page poison reporting feature
> Made page-poison independent property and set to enabled by default
> Added logic to migrate poison_val
> Added several comments in code to better explain features
> Switched free-page-reporting property to disabled by default
>
> Changes from v22:
> Added ack for patches 4 & 5
> Added additional comment fixes in patch 3 to remove "reporting" reference
> Renamed rvq in patch 5 to reporting_vq to better match linux kernel
> Moved call adding reporting_vq to after free_page_vq
>
> Changes from v23:
> Rebased on latest QEMU
> Dropped patches 1 & 2 as Linux kernel headers were synced
> Added compat machine code for page-poison feature
>
> Changes from v24:
> Moved free page hinting rename to end of set as feature may be removed entirely
> Added code to cleanup reporting_vq
>
> ---
>
> Alexander Duyck (3):
>       virtio-balloon: Implement support for page poison reporting feature
>       virtio-balloon: Provide an interface for free page reporting
>       virtio-balloon: Replace free page hinting references to 'report' with 'hint'
>
>
>  hw/core/machine.c                  |    4 +
>  hw/virtio/virtio-balloon.c         |  179 ++++++++++++++++++++++++++++--------
>  include/hw/virtio/virtio-balloon.h |   23 ++---
>  3 files changed, 155 insertions(+), 51 deletions(-)
>
> --

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 0/3] virtio-balloon: add support for page poison and free page reporting
  2020-06-08 15:37   ` [virtio-dev] " Alexander Duyck
@ 2020-06-08 16:27     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-08 16:27 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: virtio-dev, qemu-devel, David Hildenbrand

On Mon, Jun 08, 2020 at 08:37:26AM -0700, Alexander Duyck wrote:
> It's been almost 2 weeks since I submitted this. Just thought I would
> follow up and see if there is any ETA on when this might be applied,
> or if I missed the need to fix something and resubmit.
> 
> Thanks.
> 
> - Alex

Everyone seems happy with so I queued it. Thanks!



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 0/3] virtio-balloon: add support for page poison and free page reporting
@ 2020-06-08 16:27     ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-08 16:27 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: David Hildenbrand, virtio-dev, qemu-devel

On Mon, Jun 08, 2020 at 08:37:26AM -0700, Alexander Duyck wrote:
> It's been almost 2 weeks since I submitted this. Just thought I would
> follow up and see if there is any ETA on when this might be applied,
> or if I missed the need to fix something and resubmit.
> 
> Thanks.
> 
> - Alex

Everyone seems happy with so I queued it. Thanks!


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-05-27  4:14   ` [virtio-dev] " Alexander Duyck
@ 2020-06-13 20:07     ` Alexander Duyck
  -1 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-06-13 20:07 UTC (permalink / raw)
  To: David Hildenbrand, Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel

On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
<alexander.duyck@gmail.com> wrote:
>
> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>
> In an upcoming patch a feature named Free Page Reporting is about to be
> added. In order to avoid any confusion we should drop the use of the word
> 'report' when referring to Free Page Hinting. So what this patch does is go
> through and replace all instances of 'report' with 'hint" when we are
> referring to free page hinting.
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> ---
>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
>  include/hw/virtio/virtio-balloon.h |   20 +++++----
>  2 files changed, 49 insertions(+), 49 deletions(-)
>
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 3e2ac1104b5f..dc15409b0bb6 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c

...

> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>      return 0;
>  }
>
> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>      .name = "virtio-balloon-device/free-page-report",
>      .version_id = 1,
>      .minimum_version_id = 1,
>      .needed = virtio_balloon_free_page_support,
>      .fields = (VMStateField[]) {
> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>          VMSTATE_END_OF_LIST()
>      }
>  };

So I noticed this patch wasn't in the list of patches pulled, but that
is probably for the best since I believe the change above might have
broken migration as VMSTATE_UINT32 does a stringify on the first
parameter.
Any advice on how to address it, or should I just give up on renaming
free_page_report_cmd_id and free_page_report_status?

Looking at this I wonder why we even need to migrate these values? It
seems like if we are completing a migration the cmd_id should always
be "DONE" shouldn't it? It isn't as if we are going to migrate the
hinting from one host to another. We will have to start over which is
essentially the signal that the "DONE" value provides. Same thing for
the status. We shouldn't be able to migrate unless both of these are
already in the "DONE" state so if anything I wonder if we shouldn't
have that as the initial state for the device and just drop the
migration info.

Thanks.

- Alex


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-13 20:07     ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-06-13 20:07 UTC (permalink / raw)
  To: David Hildenbrand, Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel

On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
<alexander.duyck@gmail.com> wrote:
>
> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>
> In an upcoming patch a feature named Free Page Reporting is about to be
> added. In order to avoid any confusion we should drop the use of the word
> 'report' when referring to Free Page Hinting. So what this patch does is go
> through and replace all instances of 'report' with 'hint" when we are
> referring to free page hinting.
>
> Acked-by: David Hildenbrand <david@redhat.com>
> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> ---
>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
>  include/hw/virtio/virtio-balloon.h |   20 +++++----
>  2 files changed, 49 insertions(+), 49 deletions(-)
>
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 3e2ac1104b5f..dc15409b0bb6 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c

...

> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>      return 0;
>  }
>
> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>      .name = "virtio-balloon-device/free-page-report",
>      .version_id = 1,
>      .minimum_version_id = 1,
>      .needed = virtio_balloon_free_page_support,
>      .fields = (VMStateField[]) {
> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>          VMSTATE_END_OF_LIST()
>      }
>  };

So I noticed this patch wasn't in the list of patches pulled, but that
is probably for the best since I believe the change above might have
broken migration as VMSTATE_UINT32 does a stringify on the first
parameter.
Any advice on how to address it, or should I just give up on renaming
free_page_report_cmd_id and free_page_report_status?

Looking at this I wonder why we even need to migrate these values? It
seems like if we are completing a migration the cmd_id should always
be "DONE" shouldn't it? It isn't as if we are going to migrate the
hinting from one host to another. We will have to start over which is
essentially the signal that the "DONE" value provides. Same thing for
the status. We shouldn't be able to migrate unless both of these are
already in the "DONE" state so if anything I wonder if we shouldn't
have that as the initial state for the device and just drop the
migration info.

Thanks.

- Alex

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-13 20:07     ` [virtio-dev] " Alexander Duyck
@ 2020-06-18 12:54       ` David Hildenbrand
  -1 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 12:54 UTC (permalink / raw)
  To: Alexander Duyck, Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel

On 13.06.20 22:07, Alexander Duyck wrote:
> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>>
>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>>
>> In an upcoming patch a feature named Free Page Reporting is about to be
>> added. In order to avoid any confusion we should drop the use of the word
>> 'report' when referring to Free Page Hinting. So what this patch does is go
>> through and replace all instances of 'report' with 'hint" when we are
>> referring to free page hinting.
>>
>> Acked-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>> ---
>>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
>>  include/hw/virtio/virtio-balloon.h |   20 +++++----
>>  2 files changed, 49 insertions(+), 49 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index 3e2ac1104b5f..dc15409b0bb6 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
> 
> ...
> 
>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>>      return 0;
>>  }
>>
>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>>      .name = "virtio-balloon-device/free-page-report",
>>      .version_id = 1,
>>      .minimum_version_id = 1,
>>      .needed = virtio_balloon_free_page_support,
>>      .fields = (VMStateField[]) {
>> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
>> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
>> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
>> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>>          VMSTATE_END_OF_LIST()
>>      }
>>  };
> 
> So I noticed this patch wasn't in the list of patches pulled, but that
> is probably for the best since I believe the change above might have
> broken migration as VMSTATE_UINT32 does a stringify on the first
> parameter.

Indeed, it's the name of the vmstate field. But I don't think it is
relevant for migration. It's and indicator if a field is valid and it's
used in traces/error messages.

See git grep "field->name"

I don't think renaming this is problematic. Can you rebase and resent?
Thanks!

> Any advice on how to address it, or should I just give up on renaming
> free_page_report_cmd_id and free_page_report_status?
> 
> Looking at this I wonder why we even need to migrate these values? It
> seems like if we are completing a migration the cmd_id should always
> be "DONE" shouldn't it? It isn't as if we are going to migrate the

The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
always incremented until it wraps.

> hinting from one host to another. We will have to start over which is
> essentially the signal that the "DONE" value provides. Same thing for
> the status. We shouldn't be able to migrate unless both of these are
> already in the "DONE" state so if anything I wonder if we shouldn't
> have that as the initial state for the device and just drop the
> migration info.

We'll have to glue that to a compat machine unfortunately, so we can
just keep migrating it ... :(


-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 12:54       ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 12:54 UTC (permalink / raw)
  To: Alexander Duyck, Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel

On 13.06.20 22:07, Alexander Duyck wrote:
> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
>>
>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>>
>> In an upcoming patch a feature named Free Page Reporting is about to be
>> added. In order to avoid any confusion we should drop the use of the word
>> 'report' when referring to Free Page Hinting. So what this patch does is go
>> through and replace all instances of 'report' with 'hint" when we are
>> referring to free page hinting.
>>
>> Acked-by: David Hildenbrand <david@redhat.com>
>> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>> ---
>>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
>>  include/hw/virtio/virtio-balloon.h |   20 +++++----
>>  2 files changed, 49 insertions(+), 49 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index 3e2ac1104b5f..dc15409b0bb6 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
> 
> ...
> 
>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>>      return 0;
>>  }
>>
>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>>      .name = "virtio-balloon-device/free-page-report",
>>      .version_id = 1,
>>      .minimum_version_id = 1,
>>      .needed = virtio_balloon_free_page_support,
>>      .fields = (VMStateField[]) {
>> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
>> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
>> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
>> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>>          VMSTATE_END_OF_LIST()
>>      }
>>  };
> 
> So I noticed this patch wasn't in the list of patches pulled, but that
> is probably for the best since I believe the change above might have
> broken migration as VMSTATE_UINT32 does a stringify on the first
> parameter.

Indeed, it's the name of the vmstate field. But I don't think it is
relevant for migration. It's and indicator if a field is valid and it's
used in traces/error messages.

See git grep "field->name"

I don't think renaming this is problematic. Can you rebase and resent?
Thanks!

> Any advice on how to address it, or should I just give up on renaming
> free_page_report_cmd_id and free_page_report_status?
> 
> Looking at this I wonder why we even need to migrate these values? It
> seems like if we are completing a migration the cmd_id should always
> be "DONE" shouldn't it? It isn't as if we are going to migrate the

The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
always incremented until it wraps.

> hinting from one host to another. We will have to start over which is
> essentially the signal that the "DONE" value provides. Same thing for
> the status. We shouldn't be able to migrate unless both of these are
> already in the "DONE" state so if anything I wonder if we shouldn't
> have that as the initial state for the device and just drop the
> migration info.

We'll have to glue that to a compat machine unfortunately, so we can
just keep migrating it ... :(


-- 
Thanks,

David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-18 12:54       ` [virtio-dev] " David Hildenbrand
@ 2020-06-18 15:14         ` Alexander Duyck
  -1 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-06-18 15:14 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-dev, qemu-devel, Michael S. Tsirkin

On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 13.06.20 22:07, Alexander Duyck wrote:
> > On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> > <alexander.duyck@gmail.com> wrote:
> >>
> >> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>
> >> In an upcoming patch a feature named Free Page Reporting is about to be
> >> added. In order to avoid any confusion we should drop the use of the word
> >> 'report' when referring to Free Page Hinting. So what this patch does is go
> >> through and replace all instances of 'report' with 'hint" when we are
> >> referring to free page hinting.
> >>
> >> Acked-by: David Hildenbrand <david@redhat.com>
> >> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >> ---
> >>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
> >>  include/hw/virtio/virtio-balloon.h |   20 +++++----
> >>  2 files changed, 49 insertions(+), 49 deletions(-)
> >>
> >> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >> index 3e2ac1104b5f..dc15409b0bb6 100644
> >> --- a/hw/virtio/virtio-balloon.c
> >> +++ b/hw/virtio/virtio-balloon.c
> >
> > ...
> >
> >> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
> >>      return 0;
> >>  }
> >>
> >> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> >> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >>      .name = "virtio-balloon-device/free-page-report",
> >>      .version_id = 1,
> >>      .minimum_version_id = 1,
> >>      .needed = virtio_balloon_free_page_support,
> >>      .fields = (VMStateField[]) {
> >> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> >> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> >> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> >> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >>          VMSTATE_END_OF_LIST()
> >>      }
> >>  };
> >
> > So I noticed this patch wasn't in the list of patches pulled, but that
> > is probably for the best since I believe the change above might have
> > broken migration as VMSTATE_UINT32 does a stringify on the first
> > parameter.
>
> Indeed, it's the name of the vmstate field. But I don't think it is
> relevant for migration. It's and indicator if a field is valid and it's
> used in traces/error messages.
>
> See git grep "field->name"
>
> I don't think renaming this is problematic. Can you rebase and resent?
> Thanks!

Okay, I will.

> > Any advice on how to address it, or should I just give up on renaming
> > free_page_report_cmd_id and free_page_report_status?
> >
> > Looking at this I wonder why we even need to migrate these values? It
> > seems like if we are completing a migration the cmd_id should always
> > be "DONE" shouldn't it? It isn't as if we are going to migrate the
>
> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
> always incremented until it wraps.

The thing is, the cmd_id visible to the driver if the status is DONE
is the cmd_id value for DONE. So as long as the driver acknowledges
the value we could essentially start over the cmd_id without any
negative effect. The driver would have to put down a new descriptor to
start a block of hinting in order to begin reporting again so there
shouldn't be any risk of us falsely hinting pages that were in a
previous epoch.

Ugh, although now looking at it I think we might have a bug in the
QEMU code in that the driver could in theory force its way past a
"STOP" by just replaying the last command_id descriptor and then keep
going. Should be a pretty easy fix though as we should only allow a
transition to S_START if the status is S_REQUESTED/

> > hinting from one host to another. We will have to start over which is
> > essentially the signal that the "DONE" value provides. Same thing for
> > the status. We shouldn't be able to migrate unless both of these are
> > already in the "DONE" state so if anything I wonder if we shouldn't
> > have that as the initial state for the device and just drop the
> > migration info.
>
> We'll have to glue that to a compat machine unfortunately, so we can
> just keep migrating it ... :(

Yeah, I kind of figured that would be the case. However if the name
change is not an issue then it should not be a problem.

Thanks.

- Alex


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 15:14         ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-06-18 15:14 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Michael S. Tsirkin, virtio-dev, qemu-devel

On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand <david@redhat.com> wrote:
>
> On 13.06.20 22:07, Alexander Duyck wrote:
> > On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> > <alexander.duyck@gmail.com> wrote:
> >>
> >> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>
> >> In an upcoming patch a feature named Free Page Reporting is about to be
> >> added. In order to avoid any confusion we should drop the use of the word
> >> 'report' when referring to Free Page Hinting. So what this patch does is go
> >> through and replace all instances of 'report' with 'hint" when we are
> >> referring to free page hinting.
> >>
> >> Acked-by: David Hildenbrand <david@redhat.com>
> >> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >> ---
> >>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
> >>  include/hw/virtio/virtio-balloon.h |   20 +++++----
> >>  2 files changed, 49 insertions(+), 49 deletions(-)
> >>
> >> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >> index 3e2ac1104b5f..dc15409b0bb6 100644
> >> --- a/hw/virtio/virtio-balloon.c
> >> +++ b/hw/virtio/virtio-balloon.c
> >
> > ...
> >
> >> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
> >>      return 0;
> >>  }
> >>
> >> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> >> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >>      .name = "virtio-balloon-device/free-page-report",
> >>      .version_id = 1,
> >>      .minimum_version_id = 1,
> >>      .needed = virtio_balloon_free_page_support,
> >>      .fields = (VMStateField[]) {
> >> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> >> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> >> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> >> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >>          VMSTATE_END_OF_LIST()
> >>      }
> >>  };
> >
> > So I noticed this patch wasn't in the list of patches pulled, but that
> > is probably for the best since I believe the change above might have
> > broken migration as VMSTATE_UINT32 does a stringify on the first
> > parameter.
>
> Indeed, it's the name of the vmstate field. But I don't think it is
> relevant for migration. It's and indicator if a field is valid and it's
> used in traces/error messages.
>
> See git grep "field->name"
>
> I don't think renaming this is problematic. Can you rebase and resent?
> Thanks!

Okay, I will.

> > Any advice on how to address it, or should I just give up on renaming
> > free_page_report_cmd_id and free_page_report_status?
> >
> > Looking at this I wonder why we even need to migrate these values? It
> > seems like if we are completing a migration the cmd_id should always
> > be "DONE" shouldn't it? It isn't as if we are going to migrate the
>
> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
> always incremented until it wraps.

The thing is, the cmd_id visible to the driver if the status is DONE
is the cmd_id value for DONE. So as long as the driver acknowledges
the value we could essentially start over the cmd_id without any
negative effect. The driver would have to put down a new descriptor to
start a block of hinting in order to begin reporting again so there
shouldn't be any risk of us falsely hinting pages that were in a
previous epoch.

Ugh, although now looking at it I think we might have a bug in the
QEMU code in that the driver could in theory force its way past a
"STOP" by just replaying the last command_id descriptor and then keep
going. Should be a pretty easy fix though as we should only allow a
transition to S_START if the status is S_REQUESTED/

> > hinting from one host to another. We will have to start over which is
> > essentially the signal that the "DONE" value provides. Same thing for
> > the status. We shouldn't be able to migrate unless both of these are
> > already in the "DONE" state so if anything I wonder if we shouldn't
> > have that as the initial state for the device and just drop the
> > migration info.
>
> We'll have to glue that to a compat machine unfortunately, so we can
> just keep migrating it ... :(

Yeah, I kind of figured that would be the case. However if the name
change is not an issue then it should not be a problem.

Thanks.

- Alex

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-18 15:14         ` [virtio-dev] " Alexander Duyck
@ 2020-06-18 15:58           ` David Hildenbrand
  -1 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 15:58 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: virtio-dev, qemu-devel, Michael S. Tsirkin

On 18.06.20 17:14, Alexander Duyck wrote:
> On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 13.06.20 22:07, Alexander Duyck wrote:
>>> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
>>> <alexander.duyck@gmail.com> wrote:
>>>>
>>>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>>>>
>>>> In an upcoming patch a feature named Free Page Reporting is about to be
>>>> added. In order to avoid any confusion we should drop the use of the word
>>>> 'report' when referring to Free Page Hinting. So what this patch does is go
>>>> through and replace all instances of 'report' with 'hint" when we are
>>>> referring to free page hinting.
>>>>
>>>> Acked-by: David Hildenbrand <david@redhat.com>
>>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>>>> ---
>>>>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
>>>>  include/hw/virtio/virtio-balloon.h |   20 +++++----
>>>>  2 files changed, 49 insertions(+), 49 deletions(-)
>>>>
>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>>>> index 3e2ac1104b5f..dc15409b0bb6 100644
>>>> --- a/hw/virtio/virtio-balloon.c
>>>> +++ b/hw/virtio/virtio-balloon.c
>>>
>>> ...
>>>
>>>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>>>>      return 0;
>>>>  }
>>>>
>>>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
>>>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>>>>      .name = "virtio-balloon-device/free-page-report",
>>>>      .version_id = 1,
>>>>      .minimum_version_id = 1,
>>>>      .needed = virtio_balloon_free_page_support,
>>>>      .fields = (VMStateField[]) {
>>>> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
>>>> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
>>>> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
>>>> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>>>>          VMSTATE_END_OF_LIST()
>>>>      }
>>>>  };
>>>
>>> So I noticed this patch wasn't in the list of patches pulled, but that
>>> is probably for the best since I believe the change above might have
>>> broken migration as VMSTATE_UINT32 does a stringify on the first
>>> parameter.
>>
>> Indeed, it's the name of the vmstate field. But I don't think it is
>> relevant for migration. It's and indicator if a field is valid and it's
>> used in traces/error messages.
>>
>> See git grep "field->name"
>>
>> I don't think renaming this is problematic. Can you rebase and resent?
>> Thanks!
> 
> Okay, I will.
> 
>>> Any advice on how to address it, or should I just give up on renaming
>>> free_page_report_cmd_id and free_page_report_status?
>>>
>>> Looking at this I wonder why we even need to migrate these values? It
>>> seems like if we are completing a migration the cmd_id should always
>>> be "DONE" shouldn't it? It isn't as if we are going to migrate the
>>
>> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
>> always incremented until it wraps.
> 
> The thing is, the cmd_id visible to the driver if the status is DONE
> is the cmd_id value for DONE. So as long as the driver acknowledges
> the value we could essentially start over the cmd_id without any
> negative effect. The driver would have to put down a new descriptor to
> start a block of hinting in order to begin reporting again so there
> shouldn't be any risk of us falsely hinting pages that were in a
> previous epoch.
> 
> Ugh, although now looking at it I think we might have a bug in the
> QEMU code in that the driver could in theory force its way past a
> "STOP" by just replaying the last command_id descriptor and then keep
> going. Should be a pretty easy fix though as we should only allow a
> transition to S_START if the status is S_REQUESTED/

Ugh, ...

@MST, you might have missed that in another discussion, what's your
general opinion about removing free page hinting in QEMU (and Linux)? We
keep finding issues in the QEMU implementation, including non-trivial
ones, and have to speculate about the actual semantics. I can see that
e.g., libvirt does not support it yet.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 15:58           ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 15:58 UTC (permalink / raw)
  To: Alexander Duyck; +Cc: Michael S. Tsirkin, virtio-dev, qemu-devel

On 18.06.20 17:14, Alexander Duyck wrote:
> On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 13.06.20 22:07, Alexander Duyck wrote:
>>> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
>>> <alexander.duyck@gmail.com> wrote:
>>>>
>>>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>>>>
>>>> In an upcoming patch a feature named Free Page Reporting is about to be
>>>> added. In order to avoid any confusion we should drop the use of the word
>>>> 'report' when referring to Free Page Hinting. So what this patch does is go
>>>> through and replace all instances of 'report' with 'hint" when we are
>>>> referring to free page hinting.
>>>>
>>>> Acked-by: David Hildenbrand <david@redhat.com>
>>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
>>>> ---
>>>>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
>>>>  include/hw/virtio/virtio-balloon.h |   20 +++++----
>>>>  2 files changed, 49 insertions(+), 49 deletions(-)
>>>>
>>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>>>> index 3e2ac1104b5f..dc15409b0bb6 100644
>>>> --- a/hw/virtio/virtio-balloon.c
>>>> +++ b/hw/virtio/virtio-balloon.c
>>>
>>> ...
>>>
>>>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
>>>>      return 0;
>>>>  }
>>>>
>>>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
>>>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
>>>>      .name = "virtio-balloon-device/free-page-report",
>>>>      .version_id = 1,
>>>>      .minimum_version_id = 1,
>>>>      .needed = virtio_balloon_free_page_support,
>>>>      .fields = (VMStateField[]) {
>>>> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
>>>> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
>>>> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
>>>> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
>>>>          VMSTATE_END_OF_LIST()
>>>>      }
>>>>  };
>>>
>>> So I noticed this patch wasn't in the list of patches pulled, but that
>>> is probably for the best since I believe the change above might have
>>> broken migration as VMSTATE_UINT32 does a stringify on the first
>>> parameter.
>>
>> Indeed, it's the name of the vmstate field. But I don't think it is
>> relevant for migration. It's and indicator if a field is valid and it's
>> used in traces/error messages.
>>
>> See git grep "field->name"
>>
>> I don't think renaming this is problematic. Can you rebase and resent?
>> Thanks!
> 
> Okay, I will.
> 
>>> Any advice on how to address it, or should I just give up on renaming
>>> free_page_report_cmd_id and free_page_report_status?
>>>
>>> Looking at this I wonder why we even need to migrate these values? It
>>> seems like if we are completing a migration the cmd_id should always
>>> be "DONE" shouldn't it? It isn't as if we are going to migrate the
>>
>> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
>> always incremented until it wraps.
> 
> The thing is, the cmd_id visible to the driver if the status is DONE
> is the cmd_id value for DONE. So as long as the driver acknowledges
> the value we could essentially start over the cmd_id without any
> negative effect. The driver would have to put down a new descriptor to
> start a block of hinting in order to begin reporting again so there
> shouldn't be any risk of us falsely hinting pages that were in a
> previous epoch.
> 
> Ugh, although now looking at it I think we might have a bug in the
> QEMU code in that the driver could in theory force its way past a
> "STOP" by just replaying the last command_id descriptor and then keep
> going. Should be a pretty easy fix though as we should only allow a
> transition to S_START if the status is S_REQUESTED/

Ugh, ...

@MST, you might have missed that in another discussion, what's your
general opinion about removing free page hinting in QEMU (and Linux)? We
keep finding issues in the QEMU implementation, including non-trivial
ones, and have to speculate about the actual semantics. I can see that
e.g., libvirt does not support it yet.

-- 
Thanks,

David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-18 15:58           ` [virtio-dev] " David Hildenbrand
@ 2020-06-18 16:09             ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-18 16:09 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-dev, qemu-devel, Alexander Duyck

On Thu, Jun 18, 2020 at 05:58:28PM +0200, David Hildenbrand wrote:
> On 18.06.20 17:14, Alexander Duyck wrote:
> > On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 13.06.20 22:07, Alexander Duyck wrote:
> >>> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> >>> <alexander.duyck@gmail.com> wrote:
> >>>>
> >>>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>>>
> >>>> In an upcoming patch a feature named Free Page Reporting is about to be
> >>>> added. In order to avoid any confusion we should drop the use of the word
> >>>> 'report' when referring to Free Page Hinting. So what this patch does is go
> >>>> through and replace all instances of 'report' with 'hint" when we are
> >>>> referring to free page hinting.
> >>>>
> >>>> Acked-by: David Hildenbrand <david@redhat.com>
> >>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>>> ---
> >>>>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
> >>>>  include/hw/virtio/virtio-balloon.h |   20 +++++----
> >>>>  2 files changed, 49 insertions(+), 49 deletions(-)
> >>>>
> >>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >>>> index 3e2ac1104b5f..dc15409b0bb6 100644
> >>>> --- a/hw/virtio/virtio-balloon.c
> >>>> +++ b/hw/virtio/virtio-balloon.c
> >>>
> >>> ...
> >>>
> >>>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
> >>>>      return 0;
> >>>>  }
> >>>>
> >>>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> >>>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >>>>      .name = "virtio-balloon-device/free-page-report",
> >>>>      .version_id = 1,
> >>>>      .minimum_version_id = 1,
> >>>>      .needed = virtio_balloon_free_page_support,
> >>>>      .fields = (VMStateField[]) {
> >>>> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> >>>> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> >>>> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> >>>> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >>>>          VMSTATE_END_OF_LIST()
> >>>>      }
> >>>>  };
> >>>
> >>> So I noticed this patch wasn't in the list of patches pulled, but that
> >>> is probably for the best since I believe the change above might have
> >>> broken migration as VMSTATE_UINT32 does a stringify on the first
> >>> parameter.
> >>
> >> Indeed, it's the name of the vmstate field. But I don't think it is
> >> relevant for migration. It's and indicator if a field is valid and it's
> >> used in traces/error messages.
> >>
> >> See git grep "field->name"
> >>
> >> I don't think renaming this is problematic. Can you rebase and resent?
> >> Thanks!
> > 
> > Okay, I will.
> > 
> >>> Any advice on how to address it, or should I just give up on renaming
> >>> free_page_report_cmd_id and free_page_report_status?
> >>>
> >>> Looking at this I wonder why we even need to migrate these values? It
> >>> seems like if we are completing a migration the cmd_id should always
> >>> be "DONE" shouldn't it? It isn't as if we are going to migrate the
> >>
> >> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
> >> always incremented until it wraps.
> > 
> > The thing is, the cmd_id visible to the driver if the status is DONE
> > is the cmd_id value for DONE. So as long as the driver acknowledges
> > the value we could essentially start over the cmd_id without any
> > negative effect. The driver would have to put down a new descriptor to
> > start a block of hinting in order to begin reporting again so there
> > shouldn't be any risk of us falsely hinting pages that were in a
> > previous epoch.
> > 
> > Ugh, although now looking at it I think we might have a bug in the
> > QEMU code in that the driver could in theory force its way past a
> > "STOP" by just replaying the last command_id descriptor and then keep
> > going. Should be a pretty easy fix though as we should only allow a
> > transition to S_START if the status is S_REQUESTED/
> 
> Ugh, ...
> 
> @MST, you might have missed that in another discussion, what's your
> general opinion about removing free page hinting in QEMU (and Linux)? We
> keep finding issues in the QEMU implementation, including non-trivial
> ones, and have to speculate about the actual semantics. I can see that
> e.g., libvirt does not support it yet.

Not maintaining two similar features sounds attractive.

I'm still trying to get my head around the list of issues.  So far they
all look kind of minor to me.  Would you like to summarize them
somewhere?
The appeal of hinting is that it's 0 overhead outside migration,
and pains were taken to avoid keeping pages locked while
hypervisor is busy.

If we are to drop hinting completely we need to show that reporting
can be comparable, and we'll probably want to add a mode for
reporting that behaves somewhat similarly.

-- 
MST



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 16:09             ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-18 16:09 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Alexander Duyck, virtio-dev, qemu-devel

On Thu, Jun 18, 2020 at 05:58:28PM +0200, David Hildenbrand wrote:
> On 18.06.20 17:14, Alexander Duyck wrote:
> > On Thu, Jun 18, 2020 at 5:54 AM David Hildenbrand <david@redhat.com> wrote:
> >>
> >> On 13.06.20 22:07, Alexander Duyck wrote:
> >>> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> >>> <alexander.duyck@gmail.com> wrote:
> >>>>
> >>>> From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>>>
> >>>> In an upcoming patch a feature named Free Page Reporting is about to be
> >>>> added. In order to avoid any confusion we should drop the use of the word
> >>>> 'report' when referring to Free Page Hinting. So what this patch does is go
> >>>> through and replace all instances of 'report' with 'hint" when we are
> >>>> referring to free page hinting.
> >>>>
> >>>> Acked-by: David Hildenbrand <david@redhat.com>
> >>>> Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >>>> ---
> >>>>  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
> >>>>  include/hw/virtio/virtio-balloon.h |   20 +++++----
> >>>>  2 files changed, 49 insertions(+), 49 deletions(-)
> >>>>
> >>>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> >>>> index 3e2ac1104b5f..dc15409b0bb6 100644
> >>>> --- a/hw/virtio/virtio-balloon.c
> >>>> +++ b/hw/virtio/virtio-balloon.c
> >>>
> >>> ...
> >>>
> >>>> @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
> >>>>      return 0;
> >>>>  }
> >>>>
> >>>> -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> >>>> +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >>>>      .name = "virtio-balloon-device/free-page-report",
> >>>>      .version_id = 1,
> >>>>      .minimum_version_id = 1,
> >>>>      .needed = virtio_balloon_free_page_support,
> >>>>      .fields = (VMStateField[]) {
> >>>> -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> >>>> -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> >>>> +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> >>>> +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >>>>          VMSTATE_END_OF_LIST()
> >>>>      }
> >>>>  };
> >>>
> >>> So I noticed this patch wasn't in the list of patches pulled, but that
> >>> is probably for the best since I believe the change above might have
> >>> broken migration as VMSTATE_UINT32 does a stringify on the first
> >>> parameter.
> >>
> >> Indeed, it's the name of the vmstate field. But I don't think it is
> >> relevant for migration. It's and indicator if a field is valid and it's
> >> used in traces/error messages.
> >>
> >> See git grep "field->name"
> >>
> >> I don't think renaming this is problematic. Can you rebase and resent?
> >> Thanks!
> > 
> > Okay, I will.
> > 
> >>> Any advice on how to address it, or should I just give up on renaming
> >>> free_page_report_cmd_id and free_page_report_status?
> >>>
> >>> Looking at this I wonder why we even need to migrate these values? It
> >>> seems like if we are completing a migration the cmd_id should always
> >>> be "DONE" shouldn't it? It isn't as if we are going to migrate the
> >>
> >> The *status* should be DONE IIUC. The cmd_id might be relevant, no? It's
> >> always incremented until it wraps.
> > 
> > The thing is, the cmd_id visible to the driver if the status is DONE
> > is the cmd_id value for DONE. So as long as the driver acknowledges
> > the value we could essentially start over the cmd_id without any
> > negative effect. The driver would have to put down a new descriptor to
> > start a block of hinting in order to begin reporting again so there
> > shouldn't be any risk of us falsely hinting pages that were in a
> > previous epoch.
> > 
> > Ugh, although now looking at it I think we might have a bug in the
> > QEMU code in that the driver could in theory force its way past a
> > "STOP" by just replaying the last command_id descriptor and then keep
> > going. Should be a pretty easy fix though as we should only allow a
> > transition to S_START if the status is S_REQUESTED/
> 
> Ugh, ...
> 
> @MST, you might have missed that in another discussion, what's your
> general opinion about removing free page hinting in QEMU (and Linux)? We
> keep finding issues in the QEMU implementation, including non-trivial
> ones, and have to speculate about the actual semantics. I can see that
> e.g., libvirt does not support it yet.

Not maintaining two similar features sounds attractive.

I'm still trying to get my head around the list of issues.  So far they
all look kind of minor to me.  Would you like to summarize them
somewhere?
The appeal of hinting is that it's 0 overhead outside migration,
and pains were taken to avoid keeping pages locked while
hypervisor is busy.

If we are to drop hinting completely we need to show that reporting
can be comparable, and we'll probably want to add a mode for
reporting that behaves somewhat similarly.

-- 
MST


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-18 16:09             ` [virtio-dev] " Michael S. Tsirkin
@ 2020-06-18 17:10               ` David Hildenbrand
  -1 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 17:10 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel, Alexander Duyck

>>
>> Ugh, ...
>>
>> @MST, you might have missed that in another discussion, what's your
>> general opinion about removing free page hinting in QEMU (and Linux)? We
>> keep finding issues in the QEMU implementation, including non-trivial
>> ones, and have to speculate about the actual semantics. I can see that
>> e.g., libvirt does not support it yet.
> 
> Not maintaining two similar features sounds attractive.

I consider free page hinting (in QEMU) to be in an unmaintainable state
(and it looks like Alex and I are fixing a feature we don't actually
intend to use / not aware of users). In contrast to that, the free page
reporting functionality/implementation is a walk in the park.

> 
> I'm still trying to get my head around the list of issues.  So far they
> all look kind of minor to me.  Would you like to summarize them
> somewhere?

Some things I still have in my mind


1. If migration fails during RAM precopy, the guest will never receive a
DONE notification. Probably easy to fix.

2. Unclear semantics. Alex tried to document what the actual semantics
of hinted pages are. Assume the following in the guest to a previously
hinted page

/* page was hinted and is reused now */
if (page[x] != Y)
	page[x] == Y;
/* migration ends, we now run on the destination */
BUG_ON(page[x] != Y);
/* BUG, because the content chan

A guest can observe that. And that could be a random driver that just
allocated a page.

(I *assume* in Linux we might catch that using kasan, but I am not 100%
sure, also, the actual semantics to document are unclear - e.g., for
other guests)

As Alex mentioned, it is not even guaranteed in QEMU that we receive a
zero page on the destination, it could also be something else (e.g.,
previously migrated values).

3. If I am not wrong, the iothread works in
virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
the free_page_lock (no BQL).

Assume we're migrating, the iothread is active, and the guest triggers a
device reset.

virtio_balloon_device_reset() will trigger a
virtio_balloon_free_page_stop(s). That won't actually wait for the
iothread to stop, it will only temporarily lock free_page_lock and
update s->free_page_report_status.

I think there can be a race between the device reset and the iothread.
Once virtio_balloon_free_page_stop() returned,
virtio_ballloon_get_free_page_hints() can still call
- virtio_queue_set_notification(vq, 0);
- virtio_queue_set_notification(vq, 1);
- virtio_notify(vdev, vq);
- virtqueue_pop()

I doubt this is very nice.

There are other concerns I had regarding the iothread (e.g., while
reporting is active, virtio_ballloon_get_free_page_hints() is
essentially a busy loop, in contrast to documented -
continue_to_get_hints will always be true).

> The appeal of hinting is that it's 0 overhead outside migration,
> and pains were taken to avoid keeping pages locked while
> hypervisor is busy.
> 
> If we are to drop hinting completely we need to show that reporting
> can be comparable, and we'll probably want to add a mode for
> reporting that behaves somewhat similarly.

Depends on the actual users. If we're dropping a feature that nobody is
actively using, I don't think we have to show anything.

This feature obviously saw no proper review.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 17:10               ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 17:10 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Alexander Duyck, virtio-dev, qemu-devel

>>
>> Ugh, ...
>>
>> @MST, you might have missed that in another discussion, what's your
>> general opinion about removing free page hinting in QEMU (and Linux)? We
>> keep finding issues in the QEMU implementation, including non-trivial
>> ones, and have to speculate about the actual semantics. I can see that
>> e.g., libvirt does not support it yet.
> 
> Not maintaining two similar features sounds attractive.

I consider free page hinting (in QEMU) to be in an unmaintainable state
(and it looks like Alex and I are fixing a feature we don't actually
intend to use / not aware of users). In contrast to that, the free page
reporting functionality/implementation is a walk in the park.

> 
> I'm still trying to get my head around the list of issues.  So far they
> all look kind of minor to me.  Would you like to summarize them
> somewhere?

Some things I still have in my mind


1. If migration fails during RAM precopy, the guest will never receive a
DONE notification. Probably easy to fix.

2. Unclear semantics. Alex tried to document what the actual semantics
of hinted pages are. Assume the following in the guest to a previously
hinted page

/* page was hinted and is reused now */
if (page[x] != Y)
	page[x] == Y;
/* migration ends, we now run on the destination */
BUG_ON(page[x] != Y);
/* BUG, because the content chan

A guest can observe that. And that could be a random driver that just
allocated a page.

(I *assume* in Linux we might catch that using kasan, but I am not 100%
sure, also, the actual semantics to document are unclear - e.g., for
other guests)

As Alex mentioned, it is not even guaranteed in QEMU that we receive a
zero page on the destination, it could also be something else (e.g.,
previously migrated values).

3. If I am not wrong, the iothread works in
virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
the free_page_lock (no BQL).

Assume we're migrating, the iothread is active, and the guest triggers a
device reset.

virtio_balloon_device_reset() will trigger a
virtio_balloon_free_page_stop(s). That won't actually wait for the
iothread to stop, it will only temporarily lock free_page_lock and
update s->free_page_report_status.

I think there can be a race between the device reset and the iothread.
Once virtio_balloon_free_page_stop() returned,
virtio_ballloon_get_free_page_hints() can still call
- virtio_queue_set_notification(vq, 0);
- virtio_queue_set_notification(vq, 1);
- virtio_notify(vdev, vq);
- virtqueue_pop()

I doubt this is very nice.

There are other concerns I had regarding the iothread (e.g., while
reporting is active, virtio_ballloon_get_free_page_hints() is
essentially a busy loop, in contrast to documented -
continue_to_get_hints will always be true).

> The appeal of hinting is that it's 0 overhead outside migration,
> and pains were taken to avoid keeping pages locked while
> hypervisor is busy.
> 
> If we are to drop hinting completely we need to show that reporting
> can be comparable, and we'll probably want to add a mode for
> reporting that behaves somewhat similarly.

Depends on the actual users. If we're dropping a feature that nobody is
actively using, I don't think we have to show anything.

This feature obviously saw no proper review.

-- 
Thanks,

David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-18 17:10               ` [virtio-dev] " David Hildenbrand
@ 2020-06-18 17:20                 ` David Hildenbrand
  -1 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 17:20 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel, Alexander Duyck

 > There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).

FWIW, I just double checked this and my memory was bad.

 -
-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 17:20                 ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 17:20 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Alexander Duyck, virtio-dev, qemu-devel

 > There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).

FWIW, I just double checked this and my memory was bad.

 -
-- 
Thanks,

David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-13 20:07     ` [virtio-dev] " Alexander Duyck
@ 2020-06-18 19:00       ` Dr. David Alan Gilbert
  -1 siblings, 0 replies; 44+ messages in thread
From: Dr. David Alan Gilbert @ 2020-06-18 19:00 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: virtio-dev, Michael S. Tsirkin, qemu-devel, David Hildenbrand

* Alexander Duyck (alexander.duyck@gmail.com) wrote:
> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
> >
> > From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >
> > In an upcoming patch a feature named Free Page Reporting is about to be
> > added. In order to avoid any confusion we should drop the use of the word
> > 'report' when referring to Free Page Hinting. So what this patch does is go
> > through and replace all instances of 'report' with 'hint" when we are
> > referring to free page hinting.
> >
> > Acked-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > ---
> >  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
> >  include/hw/virtio/virtio-balloon.h |   20 +++++----
> >  2 files changed, 49 insertions(+), 49 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index 3e2ac1104b5f..dc15409b0bb6 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> 
> ...
> 
> > @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
> >      return 0;
> >  }
> >
> > -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> > +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >      .name = "virtio-balloon-device/free-page-report",
> >      .version_id = 1,
> >      .minimum_version_id = 1,
> >      .needed = virtio_balloon_free_page_support,
> >      .fields = (VMStateField[]) {
> > -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> > -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> > +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> > +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >          VMSTATE_END_OF_LIST()
> >      }
> >  };
> 
> So I noticed this patch wasn't in the list of patches pulled, but that
> is probably for the best since I believe the change above might have
> broken migration as VMSTATE_UINT32 does a stringify on the first
> parameter.
> Any advice on how to address it, or should I just give up on renaming
> free_page_report_cmd_id and free_page_report_status?

The filed names never hit the wire; the migration format is trivial
binary, especially of things like integers - that lands as just 4 bytes
on the wire [ hopefully in the place where the destination expects to
receive them ].
You need to be careful of the names of top level vmstate devices, and
the names of subsections; I don't think any other naming is in the
stream.
(We've even done hacks in the past of converting a VMSTATE_UINT32 to a
pair of UINT16 )

Dave

> Looking at this I wonder why we even need to migrate these values? It
> seems like if we are completing a migration the cmd_id should always
> be "DONE" shouldn't it? It isn't as if we are going to migrate the
> hinting from one host to another. We will have to start over which is
> essentially the signal that the "DONE" value provides. Same thing for
> the status. We shouldn't be able to migrate unless both of these are
> already in the "DONE" state so if anything I wonder if we shouldn't
> have that as the initial state for the device and just drop the
> migration info.
> 
> Thanks.
> 
> - Alex
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK



^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 19:00       ` Dr. David Alan Gilbert
  0 siblings, 0 replies; 44+ messages in thread
From: Dr. David Alan Gilbert @ 2020-06-18 19:00 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Hildenbrand, Michael S. Tsirkin, virtio-dev, qemu-devel

* Alexander Duyck (alexander.duyck@gmail.com) wrote:
> On Tue, May 26, 2020 at 9:14 PM Alexander Duyck
> <alexander.duyck@gmail.com> wrote:
> >
> > From: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> >
> > In an upcoming patch a feature named Free Page Reporting is about to be
> > added. In order to avoid any confusion we should drop the use of the word
> > 'report' when referring to Free Page Hinting. So what this patch does is go
> > through and replace all instances of 'report' with 'hint" when we are
> > referring to free page hinting.
> >
> > Acked-by: David Hildenbrand <david@redhat.com>
> > Signed-off-by: Alexander Duyck <alexander.h.duyck@linux.intel.com>
> > ---
> >  hw/virtio/virtio-balloon.c         |   78 ++++++++++++++++++------------------
> >  include/hw/virtio/virtio-balloon.h |   20 +++++----
> >  2 files changed, 49 insertions(+), 49 deletions(-)
> >
> > diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> > index 3e2ac1104b5f..dc15409b0bb6 100644
> > --- a/hw/virtio/virtio-balloon.c
> > +++ b/hw/virtio/virtio-balloon.c
> 
> ...
> 
> > @@ -817,14 +817,14 @@ static int virtio_balloon_post_load_device(void *opaque, int version_id)
> >      return 0;
> >  }
> >
> > -static const VMStateDescription vmstate_virtio_balloon_free_page_report = {
> > +static const VMStateDescription vmstate_virtio_balloon_free_page_hint = {
> >      .name = "virtio-balloon-device/free-page-report",
> >      .version_id = 1,
> >      .minimum_version_id = 1,
> >      .needed = virtio_balloon_free_page_support,
> >      .fields = (VMStateField[]) {
> > -        VMSTATE_UINT32(free_page_report_cmd_id, VirtIOBalloon),
> > -        VMSTATE_UINT32(free_page_report_status, VirtIOBalloon),
> > +        VMSTATE_UINT32(free_page_hint_cmd_id, VirtIOBalloon),
> > +        VMSTATE_UINT32(free_page_hint_status, VirtIOBalloon),
> >          VMSTATE_END_OF_LIST()
> >      }
> >  };
> 
> So I noticed this patch wasn't in the list of patches pulled, but that
> is probably for the best since I believe the change above might have
> broken migration as VMSTATE_UINT32 does a stringify on the first
> parameter.
> Any advice on how to address it, or should I just give up on renaming
> free_page_report_cmd_id and free_page_report_status?

The filed names never hit the wire; the migration format is trivial
binary, especially of things like integers - that lands as just 4 bytes
on the wire [ hopefully in the place where the destination expects to
receive them ].
You need to be careful of the names of top level vmstate devices, and
the names of subsections; I don't think any other naming is in the
stream.
(We've even done hacks in the past of converting a VMSTATE_UINT32 to a
pair of UINT16 )

Dave

> Looking at this I wonder why we even need to migrate these values? It
> seems like if we are completing a migration the cmd_id should always
> be "DONE" shouldn't it? It isn't as if we are going to migrate the
> hinting from one host to another. We will have to start over which is
> essentially the signal that the "DONE" value provides. Same thing for
> the status. We shouldn't be able to migrate unless both of these are
> already in the "DONE" state so if anything I wonder if we shouldn't
> have that as the initial state for the device and just drop the
> migration info.
> 
> Thanks.
> 
> - Alex
> 
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
> For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org
> 
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-18 17:10               ` [virtio-dev] " David Hildenbrand
@ 2020-06-18 19:45                 ` Alexander Duyck
  -1 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-06-18 19:45 UTC (permalink / raw)
  To: David Hildenbrand, Wang, Wei W; +Cc: virtio-dev, qemu-devel, Michael S. Tsirkin

On Thu, Jun 18, 2020 at 10:10 AM David Hildenbrand <david@redhat.com> wrote:
>
> >>
> >> Ugh, ...
> >>
> >> @MST, you might have missed that in another discussion, what's your
> >> general opinion about removing free page hinting in QEMU (and Linux)? We
> >> keep finding issues in the QEMU implementation, including non-trivial
> >> ones, and have to speculate about the actual semantics. I can see that
> >> e.g., libvirt does not support it yet.
> >
> > Not maintaining two similar features sounds attractive.

Agreed. Just to make sure we are all on the same page I am adding Wei
Wang since he was the original author for page hinting.

> I consider free page hinting (in QEMU) to be in an unmaintainable state
> (and it looks like Alex and I are fixing a feature we don't actually
> intend to use / not aware of users). In contrast to that, the free page
> reporting functionality/implementation is a walk in the park.
>
> >
> > I'm still trying to get my head around the list of issues.  So far they
> > all look kind of minor to me.  Would you like to summarize them
> > somewhere?
>
> Some things I still have in my mind
>
>
> 1. If migration fails during RAM precopy, the guest will never receive a
> DONE notification. Probably easy to fix.

Agreed. It is just a matter of finding the right point to add a hook
so that if we abort the migration we can report DONE.

> 2. Unclear semantics. Alex tried to document what the actual semantics
> of hinted pages are. Assume the following in the guest to a previously
> hinted page
>
> /* page was hinted and is reused now */
> if (page[x] != Y)
>         page[x] == Y;
> /* migration ends, we now run on the destination */
> BUG_ON(page[x] != Y);
> /* BUG, because the content chan
>
> A guest can observe that. And that could be a random driver that just
> allocated a page.
>
> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> sure, also, the actual semantics to document are unclear - e.g., for
> other guests)
>
> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
> zero page on the destination, it could also be something else (e.g.,
> previously migrated values).

So this is only an issue for pages that are pushed out of the balloon
as a part of the shrinker process though. So fixing it would be pretty
straightforward as we would just have to initialize or at least dirty
pages that are leaked as a part of the shrinker. That may have an
impact on performance though as it would result in us dirtying pages
that are freed as a result of the shrinker being triggered.

> 3. If I am not wrong, the iothread works in
> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
> the free_page_lock (no BQL).
>
> Assume we're migrating, the iothread is active, and the guest triggers a
> device reset.
>
> virtio_balloon_device_reset() will trigger a
> virtio_balloon_free_page_stop(s). That won't actually wait for the
> iothread to stop, it will only temporarily lock free_page_lock and
> update s->free_page_report_status.
>
> I think there can be a race between the device reset and the iothread.
> Once virtio_balloon_free_page_stop() returned,
> virtio_ballloon_get_free_page_hints() can still call
> - virtio_queue_set_notification(vq, 0);
> - virtio_queue_set_notification(vq, 1);
> - virtio_notify(vdev, vq);
> - virtqueue_pop()
>
> I doubt this is very nice.

And our conversation had me start looking though reference to
virtio_balloon_free_page_stop. It looks like we call it for when we
unrealize the device or reset the device. It might make more sense for
us to look at pushing the status to DONE and forcing the iothread to
be flushed out.

> There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).
>
> > The appeal of hinting is that it's 0 overhead outside migration,
> > and pains were taken to avoid keeping pages locked while
> > hypervisor is busy.
> >
> > If we are to drop hinting completely we need to show that reporting
> > can be comparable, and we'll probably want to add a mode for
> > reporting that behaves somewhat similarly.
>
> Depends on the actual users. If we're dropping a feature that nobody is
> actively using, I don't think we have to show anything.
>
> This feature obviously saw no proper review.

I'm pretty sure it had some, as it went through several iterations as
I recall. However I don't think the review of the virtio interface was
very detailed as I think most of the attention was on the kernel
interface.

As far as trying to do this with page reporting it would be doable,
but I would need to use something like the command interface so that I
would have a way to tell the driver when to drop the reported bit from
the pages and when to stop/resume hinting. However it still wouldn't
resolve the issue of copy on write style pages where the page is only
read and never updated. I honestly wonder if it wouldn't be better to
just apply the same logic that I have been doing with page reporting
and migrate a zero page for a hinted page and MADVISE the page away so
that it should be zero on the origin. Then you get the benefit of the
guest shrinking as you perform the migration and would only be
transmitting zero pages to the other side which should be
significantly smaller if I am not mistaken.


^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 19:45                 ` Alexander Duyck
  0 siblings, 0 replies; 44+ messages in thread
From: Alexander Duyck @ 2020-06-18 19:45 UTC (permalink / raw)
  To: David Hildenbrand, Wang, Wei W; +Cc: Michael S. Tsirkin, virtio-dev, qemu-devel

On Thu, Jun 18, 2020 at 10:10 AM David Hildenbrand <david@redhat.com> wrote:
>
> >>
> >> Ugh, ...
> >>
> >> @MST, you might have missed that in another discussion, what's your
> >> general opinion about removing free page hinting in QEMU (and Linux)? We
> >> keep finding issues in the QEMU implementation, including non-trivial
> >> ones, and have to speculate about the actual semantics. I can see that
> >> e.g., libvirt does not support it yet.
> >
> > Not maintaining two similar features sounds attractive.

Agreed. Just to make sure we are all on the same page I am adding Wei
Wang since he was the original author for page hinting.

> I consider free page hinting (in QEMU) to be in an unmaintainable state
> (and it looks like Alex and I are fixing a feature we don't actually
> intend to use / not aware of users). In contrast to that, the free page
> reporting functionality/implementation is a walk in the park.
>
> >
> > I'm still trying to get my head around the list of issues.  So far they
> > all look kind of minor to me.  Would you like to summarize them
> > somewhere?
>
> Some things I still have in my mind
>
>
> 1. If migration fails during RAM precopy, the guest will never receive a
> DONE notification. Probably easy to fix.

Agreed. It is just a matter of finding the right point to add a hook
so that if we abort the migration we can report DONE.

> 2. Unclear semantics. Alex tried to document what the actual semantics
> of hinted pages are. Assume the following in the guest to a previously
> hinted page
>
> /* page was hinted and is reused now */
> if (page[x] != Y)
>         page[x] == Y;
> /* migration ends, we now run on the destination */
> BUG_ON(page[x] != Y);
> /* BUG, because the content chan
>
> A guest can observe that. And that could be a random driver that just
> allocated a page.
>
> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> sure, also, the actual semantics to document are unclear - e.g., for
> other guests)
>
> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
> zero page on the destination, it could also be something else (e.g.,
> previously migrated values).

So this is only an issue for pages that are pushed out of the balloon
as a part of the shrinker process though. So fixing it would be pretty
straightforward as we would just have to initialize or at least dirty
pages that are leaked as a part of the shrinker. That may have an
impact on performance though as it would result in us dirtying pages
that are freed as a result of the shrinker being triggered.

> 3. If I am not wrong, the iothread works in
> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
> the free_page_lock (no BQL).
>
> Assume we're migrating, the iothread is active, and the guest triggers a
> device reset.
>
> virtio_balloon_device_reset() will trigger a
> virtio_balloon_free_page_stop(s). That won't actually wait for the
> iothread to stop, it will only temporarily lock free_page_lock and
> update s->free_page_report_status.
>
> I think there can be a race between the device reset and the iothread.
> Once virtio_balloon_free_page_stop() returned,
> virtio_ballloon_get_free_page_hints() can still call
> - virtio_queue_set_notification(vq, 0);
> - virtio_queue_set_notification(vq, 1);
> - virtio_notify(vdev, vq);
> - virtqueue_pop()
>
> I doubt this is very nice.

And our conversation had me start looking though reference to
virtio_balloon_free_page_stop. It looks like we call it for when we
unrealize the device or reset the device. It might make more sense for
us to look at pushing the status to DONE and forcing the iothread to
be flushed out.

> There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).
>
> > The appeal of hinting is that it's 0 overhead outside migration,
> > and pains were taken to avoid keeping pages locked while
> > hypervisor is busy.
> >
> > If we are to drop hinting completely we need to show that reporting
> > can be comparable, and we'll probably want to add a mode for
> > reporting that behaves somewhat similarly.
>
> Depends on the actual users. If we're dropping a feature that nobody is
> actively using, I don't think we have to show anything.
>
> This feature obviously saw no proper review.

I'm pretty sure it had some, as it went through several iterations as
I recall. However I don't think the review of the virtio interface was
very detailed as I think most of the attention was on the kernel
interface.

As far as trying to do this with page reporting it would be doable,
but I would need to use something like the command interface so that I
would have a way to tell the driver when to drop the reported bit from
the pages and when to stop/resume hinting. However it still wouldn't
resolve the issue of copy on write style pages where the page is only
read and never updated. I honestly wonder if it wouldn't be better to
just apply the same logic that I have been doing with page reporting
and migrate a zero page for a hinted page and MADVISE the page away so
that it should be zero on the origin. Then you get the benefit of the
guest shrinking as you perform the migration and would only be
transmitting zero pages to the other side which should be
significantly smaller if I am not mistaken.

---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-18 19:45                 ` [virtio-dev] " Alexander Duyck
@ 2020-06-18 20:20                   ` David Hildenbrand
  -1 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 20:20 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: virtio-dev, Wang, Wei W, Michael S. Tsirkin, qemu-devel,
	David Hildenbrand

> 
>> 2. Unclear semantics. Alex tried to document what the actual semantics
>> of hinted pages are. Assume the following in the guest to a previously
>> hinted page
>> 
>> /* page was hinted and is reused now */
>> if (page[x] != Y)
>>        page[x] == Y;
>> /* migration ends, we now run on the destination */
>> BUG_ON(page[x] != Y);
>> /* BUG, because the content chan
>> 
>> A guest can observe that. And that could be a random driver that just
>> allocated a page.
>> 
>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>> sure, also, the actual semantics to document are unclear - e.g., for
>> other guests)
>> 
>> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
>> zero page on the destination, it could also be something else (e.g.,
>> previously migrated values).
> 
> So this is only an issue for pages that are pushed out of the balloon
> as a part of the shrinker process though. So fixing it would be pretty
> straightforward as we would just have to initialize or at least dirty
> pages that are leaked as a part of the shrinker. That may have an
> impact on performance though as it would result in us dirtying pages
> that are freed as a result of the shrinker being triggered.
> 

It really depends on the desired semantics, which are unclear because there is no doc/spec. Either QEMU is buggy or the kernel is buggy.

>> 3. If I am not wrong, the iothread works in
>> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
>> the free_page_lock (no BQL).
>> 
>> Assume we're migrating, the iothread is active, and the guest triggers a
>> device reset.
>> 
>> virtio_balloon_device_reset() will trigger a
>> virtio_balloon_free_page_stop(s). That won't actually wait for the
>> iothread to stop, it will only temporarily lock free_page_lock and
>> update s->free_page_report_status.
>> 
>> I think there can be a race between the device reset and the iothread.
>> Once virtio_balloon_free_page_stop() returned,
>> virtio_ballloon_get_free_page_hints() can still call
>> - virtio_queue_set_notification(vq, 0);
>> - virtio_queue_set_notification(vq, 1);
>> - virtio_notify(vdev, vq);
>> - virtqueue_pop()
>> 
>> I doubt this is very nice.
> 
> And our conversation had me start looking though reference to
> virtio_balloon_free_page_stop. It looks like we call it for when we
> unrealize the device or reset the device. It might make more sense for
> us to look at pushing the status to DONE and forcing the iothread to
> be flushed out.
> 
>> There are other concerns I had regarding the iothread (e.g., while
>> reporting is active, virtio_ballloon_get_free_page_hints() is
>> essentially a busy loop, in contrast to documented -
>> continue_to_get_hints will always be true).
>> 
>>> The appeal of hinting is that it's 0 overhead outside migration,
>>> and pains were taken to avoid keeping pages locked while
>>> hypervisor is busy.
>>> 
>>> If we are to drop hinting completely we need to show that reporting
>>> can be comparable, and we'll probably want to add a mode for
>>> reporting that behaves somewhat similarly.
>> 
>> Depends on the actual users. If we're dropping a feature that nobody is
>> actively using, I don't think we have to show anything.
>> 
>> This feature obviously saw no proper review.
> 
> I'm pretty sure it had some, as it went through several iterations as
> I recall. However I don't think the review of the virtio interface was
> very detailed as I think most of the attention was on the kernel
> interface.

Yes, that‘s what I meant. The kernel side and the migration code (QEMU) got a lot of attention.



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-18 20:20                   ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-18 20:20 UTC (permalink / raw)
  To: Alexander Duyck
  Cc: David Hildenbrand, Wang, Wei W, Michael S. Tsirkin, virtio-dev,
	qemu-devel

> 
>> 2. Unclear semantics. Alex tried to document what the actual semantics
>> of hinted pages are. Assume the following in the guest to a previously
>> hinted page
>> 
>> /* page was hinted and is reused now */
>> if (page[x] != Y)
>>        page[x] == Y;
>> /* migration ends, we now run on the destination */
>> BUG_ON(page[x] != Y);
>> /* BUG, because the content chan
>> 
>> A guest can observe that. And that could be a random driver that just
>> allocated a page.
>> 
>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>> sure, also, the actual semantics to document are unclear - e.g., for
>> other guests)
>> 
>> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
>> zero page on the destination, it could also be something else (e.g.,
>> previously migrated values).
> 
> So this is only an issue for pages that are pushed out of the balloon
> as a part of the shrinker process though. So fixing it would be pretty
> straightforward as we would just have to initialize or at least dirty
> pages that are leaked as a part of the shrinker. That may have an
> impact on performance though as it would result in us dirtying pages
> that are freed as a result of the shrinker being triggered.
> 

It really depends on the desired semantics, which are unclear because there is no doc/spec. Either QEMU is buggy or the kernel is buggy.

>> 3. If I am not wrong, the iothread works in
>> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
>> the free_page_lock (no BQL).
>> 
>> Assume we're migrating, the iothread is active, and the guest triggers a
>> device reset.
>> 
>> virtio_balloon_device_reset() will trigger a
>> virtio_balloon_free_page_stop(s). That won't actually wait for the
>> iothread to stop, it will only temporarily lock free_page_lock and
>> update s->free_page_report_status.
>> 
>> I think there can be a race between the device reset and the iothread.
>> Once virtio_balloon_free_page_stop() returned,
>> virtio_ballloon_get_free_page_hints() can still call
>> - virtio_queue_set_notification(vq, 0);
>> - virtio_queue_set_notification(vq, 1);
>> - virtio_notify(vdev, vq);
>> - virtqueue_pop()
>> 
>> I doubt this is very nice.
> 
> And our conversation had me start looking though reference to
> virtio_balloon_free_page_stop. It looks like we call it for when we
> unrealize the device or reset the device. It might make more sense for
> us to look at pushing the status to DONE and forcing the iothread to
> be flushed out.
> 
>> There are other concerns I had regarding the iothread (e.g., while
>> reporting is active, virtio_ballloon_get_free_page_hints() is
>> essentially a busy loop, in contrast to documented -
>> continue_to_get_hints will always be true).
>> 
>>> The appeal of hinting is that it's 0 overhead outside migration,
>>> and pains were taken to avoid keeping pages locked while
>>> hypervisor is busy.
>>> 
>>> If we are to drop hinting completely we need to show that reporting
>>> can be comparable, and we'll probably want to add a mode for
>>> reporting that behaves somewhat similarly.
>> 
>> Depends on the actual users. If we're dropping a feature that nobody is
>> actively using, I don't think we have to show anything.
>> 
>> This feature obviously saw no proper review.
> 
> I'm pretty sure it had some, as it went through several iterations as
> I recall. However I don't think the review of the virtio interface was
> very detailed as I think most of the attention was on the kernel
> interface.

Yes, that‘s what I meant. The kernel side and the migration code (QEMU) got a lot of attention.


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-18 17:10               ` [virtio-dev] " David Hildenbrand
@ 2020-06-24 14:56                 ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-24 14:56 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-dev, qemu-devel, Alexander Duyck

On Thu, Jun 18, 2020 at 07:10:43PM +0200, David Hildenbrand wrote:
> >>
> >> Ugh, ...
> >>
> >> @MST, you might have missed that in another discussion, what's your
> >> general opinion about removing free page hinting in QEMU (and Linux)? We
> >> keep finding issues in the QEMU implementation, including non-trivial
> >> ones, and have to speculate about the actual semantics. I can see that
> >> e.g., libvirt does not support it yet.
> > 
> > Not maintaining two similar features sounds attractive.
> 
> I consider free page hinting (in QEMU) to be in an unmaintainable state
> (and it looks like Alex and I are fixing a feature we don't actually
> intend to use / not aware of users). In contrast to that, the free page
> reporting functionality/implementation is a walk in the park.

So at the high level the idea was simple, we just clear the dirty bit
when page is hinted, unless we sent a new command since. Implementation
was reviewed by migration maintainers. If there's a consensus the code
is written so badly we can't maintain it, maybe we should remove it.
Which parts are unmaintainable in your eyes - migration or virtio ones?
Or maybe it's the general thing that interface was never specced
properly.

> > 
> > I'm still trying to get my head around the list of issues.  So far they
> > all look kind of minor to me.  Would you like to summarize them
> > somewhere?
> 
> Some things I still have in my mind

Thanks for the summary!

> 
> 1. If migration fails during RAM precopy, the guest will never receive a
> DONE notification. Probably easy to fix.
> 
> 2. Unclear semantics. Alex tried to document what the actual semantics
> of hinted pages are.

I'll reply to that now.

> Assume the following in the guest to a previously
> hinted page
> 
> /* page was hinted and is reused now */
> if (page[x] != Y)
> 	page[x] == Y;
> /* migration ends, we now run on the destination */
> BUG_ON(page[x] != Y);
> /* BUG, because the content chan

The assumption hinting makes is that data in page is writtent to before it's used.


> A guest can observe that. And that could be a random driver that just
> allocated a page.
> 
> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> sure, also, the actual semantics to document are unclear - e.g., for
> other guests)

I think it's basically simple: hinting means it's ok to
fill page with trash unless it has been modified since the command
ID supplied.

> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
> zero page on the destination, it could also be something else (e.g.,
> previously migrated values).


Absolutely.

> 3. If I am not wrong, the iothread works in
> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
> the free_page_lock (no BQL).
> 
> Assume we're migrating, the iothread is active, and the guest triggers a
> device reset.
> 
> virtio_balloon_device_reset() will trigger a
> virtio_balloon_free_page_stop(s). That won't actually wait for the
> iothread to stop, it will only temporarily lock free_page_lock and
> update s->free_page_report_status.
> 
> I think there can be a race between the device reset and the iothread.
> Once virtio_balloon_free_page_stop() returned,
> virtio_ballloon_get_free_page_hints() can still call
> - virtio_queue_set_notification(vq, 0);
> - virtio_queue_set_notification(vq, 1);
> - virtio_notify(vdev, vq);
> - virtqueue_pop()
> 
> I doubt this is very nice.

Reset is notoriously hard to get right.

> There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).

So that would be a performance issue you are suggesting, right?

> > The appeal of hinting is that it's 0 overhead outside migration,
> > and pains were taken to avoid keeping pages locked while
> > hypervisor is busy.
> > 
> > If we are to drop hinting completely we need to show that reporting
> > can be comparable, and we'll probably want to add a mode for
> > reporting that behaves somewhat similarly.
> 
> Depends on the actual users. If we're dropping a feature that nobody is
> actively using, I don't think we have to show anything.


I don't know how to find out. So far it doesn't look like we found
any common data corruptions that would indicate no one can use it safely.
Races around reset aren't all that uncommon but I don't think that
qualifies as a deal breaker.

I find the idea of asynchronously sending hints to host without
waiting for them to be processed intriguing. Not something
I'd work on implementing if we had reporting originally,
but since it's there I'm not sure we should just discard it
at this point.

> This feature obviously saw no proper review.

I did my best but obviously missed some things.

> -- 
> Thanks,
> 
> David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-24 14:56                 ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-24 14:56 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Alexander Duyck, virtio-dev, qemu-devel

On Thu, Jun 18, 2020 at 07:10:43PM +0200, David Hildenbrand wrote:
> >>
> >> Ugh, ...
> >>
> >> @MST, you might have missed that in another discussion, what's your
> >> general opinion about removing free page hinting in QEMU (and Linux)? We
> >> keep finding issues in the QEMU implementation, including non-trivial
> >> ones, and have to speculate about the actual semantics. I can see that
> >> e.g., libvirt does not support it yet.
> > 
> > Not maintaining two similar features sounds attractive.
> 
> I consider free page hinting (in QEMU) to be in an unmaintainable state
> (and it looks like Alex and I are fixing a feature we don't actually
> intend to use / not aware of users). In contrast to that, the free page
> reporting functionality/implementation is a walk in the park.

So at the high level the idea was simple, we just clear the dirty bit
when page is hinted, unless we sent a new command since. Implementation
was reviewed by migration maintainers. If there's a consensus the code
is written so badly we can't maintain it, maybe we should remove it.
Which parts are unmaintainable in your eyes - migration or virtio ones?
Or maybe it's the general thing that interface was never specced
properly.

> > 
> > I'm still trying to get my head around the list of issues.  So far they
> > all look kind of minor to me.  Would you like to summarize them
> > somewhere?
> 
> Some things I still have in my mind

Thanks for the summary!

> 
> 1. If migration fails during RAM precopy, the guest will never receive a
> DONE notification. Probably easy to fix.
> 
> 2. Unclear semantics. Alex tried to document what the actual semantics
> of hinted pages are.

I'll reply to that now.

> Assume the following in the guest to a previously
> hinted page
> 
> /* page was hinted and is reused now */
> if (page[x] != Y)
> 	page[x] == Y;
> /* migration ends, we now run on the destination */
> BUG_ON(page[x] != Y);
> /* BUG, because the content chan

The assumption hinting makes is that data in page is writtent to before it's used.


> A guest can observe that. And that could be a random driver that just
> allocated a page.
> 
> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> sure, also, the actual semantics to document are unclear - e.g., for
> other guests)

I think it's basically simple: hinting means it's ok to
fill page with trash unless it has been modified since the command
ID supplied.

> As Alex mentioned, it is not even guaranteed in QEMU that we receive a
> zero page on the destination, it could also be something else (e.g.,
> previously migrated values).


Absolutely.

> 3. If I am not wrong, the iothread works in
> virtio_ballloon_get_free_page_hints() on the virtqueue only with holding
> the free_page_lock (no BQL).
> 
> Assume we're migrating, the iothread is active, and the guest triggers a
> device reset.
> 
> virtio_balloon_device_reset() will trigger a
> virtio_balloon_free_page_stop(s). That won't actually wait for the
> iothread to stop, it will only temporarily lock free_page_lock and
> update s->free_page_report_status.
> 
> I think there can be a race between the device reset and the iothread.
> Once virtio_balloon_free_page_stop() returned,
> virtio_ballloon_get_free_page_hints() can still call
> - virtio_queue_set_notification(vq, 0);
> - virtio_queue_set_notification(vq, 1);
> - virtio_notify(vdev, vq);
> - virtqueue_pop()
> 
> I doubt this is very nice.

Reset is notoriously hard to get right.

> There are other concerns I had regarding the iothread (e.g., while
> reporting is active, virtio_ballloon_get_free_page_hints() is
> essentially a busy loop, in contrast to documented -
> continue_to_get_hints will always be true).

So that would be a performance issue you are suggesting, right?

> > The appeal of hinting is that it's 0 overhead outside migration,
> > and pains were taken to avoid keeping pages locked while
> > hypervisor is busy.
> > 
> > If we are to drop hinting completely we need to show that reporting
> > can be comparable, and we'll probably want to add a mode for
> > reporting that behaves somewhat similarly.
> 
> Depends on the actual users. If we're dropping a feature that nobody is
> actively using, I don't think we have to show anything.


I don't know how to find out. So far it doesn't look like we found
any common data corruptions that would indicate no one can use it safely.
Races around reset aren't all that uncommon but I don't think that
qualifies as a deal breaker.

I find the idea of asynchronously sending hints to host without
waiting for them to be processed intriguing. Not something
I'd work on implementing if we had reporting originally,
but since it's there I'm not sure we should just discard it
at this point.

> This feature obviously saw no proper review.

I did my best but obviously missed some things.

> -- 
> Thanks,
> 
> David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-24 14:56                 ` [virtio-dev] " Michael S. Tsirkin
@ 2020-06-24 15:28                   ` David Hildenbrand
  -1 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-24 15:28 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel, Alexander Duyck

> So at the high level the idea was simple, we just clear the dirty bit
> when page is hinted, unless we sent a new command since. Implementation
> was reviewed by migration maintainers. If there's a consensus the code
> is written so badly we can't maintain it, maybe we should remove it.
> Which parts are unmaintainable in your eyes - migration or virtio ones?

QEMU implementation without a propert virtio specification. I hope that
we can *at least* finally document the expected behavior. Alex gave it a
shot, and I was hoping that Wei could jump in to clarify, help move this
forward ... after all he implemented (+designed?) the feature and the
virtio interface.

> Or maybe it's the general thing that interface was never specced
> properly.

Yes, a spec would be definitely a good starter ...

[...]

>>
>> 1. If migration fails during RAM precopy, the guest will never receive a
>> DONE notification. Probably easy to fix.
>>
>> 2. Unclear semantics. Alex tried to document what the actual semantics
>> of hinted pages are.
> 
> I'll reply to that now.
> 
>> Assume the following in the guest to a previously
>> hinted page
>>
>> /* page was hinted and is reused now */
>> if (page[x] != Y)
>> 	page[x] == Y;
>> /* migration ends, we now run on the destination */
>> BUG_ON(page[x] != Y);
>> /* BUG, because the content chan
> 
> The assumption hinting makes is that data in page is writtent to before it's used.
> 
> 
>> A guest can observe that. And that could be a random driver that just
>> allocated a page.
>>
>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>> sure, also, the actual semantics to document are unclear - e.g., for
>> other guests)
> 
> I think it's basically simple: hinting means it's ok to
> fill page with trash unless it has been modified since the command
> ID supplied.

Yeah, I quite dislike the semantics, especially, as they are different
to well-know semantics as e.g., represent in MADV_FREE. Getting changed
content when reading is really weird. But it seemed to be easier to
implement (low hanging fruit) and nobody complained back then. Well, now
we are stuck with it.

[..]

> 
>> There are other concerns I had regarding the iothread (e.g., while
>> reporting is active, virtio_ballloon_get_free_page_hints() is
>> essentially a busy loop, in contrast to documented -
>> continue_to_get_hints will always be true).
> 
> So that would be a performance issue you are suggesting, right?

I misread the code, so that comment does no longer apply (see other
message).

> 
>>> The appeal of hinting is that it's 0 overhead outside migration,
>>> and pains were taken to avoid keeping pages locked while
>>> hypervisor is busy.
>>>
>>> If we are to drop hinting completely we need to show that reporting
>>> can be comparable, and we'll probably want to add a mode for
>>> reporting that behaves somewhat similarly.
>>
>> Depends on the actual users. If we're dropping a feature that nobody is
>> actively using, I don't think we have to show anything.
> 
> 
> I don't know how to find out. So far it doesn't look like we found
> any common data corruptions that would indicate no one can use it safely.
> Races around reset aren't all that uncommon but I don't think that
> qualifies as a deal breaker.

As I said, there are no libvirt bindings, so at least anything using
libvirt does not use it. I'd be curious about actual users.

> 
> I find the idea of asynchronously sending hints to host without
> waiting for them to be processed intriguing. Not something
> I'd work on implementing if we had reporting originally,
> but since it's there I'm not sure we should just discard it
> at this point.
> 
>> This feature obviously saw no proper review.
> 
> I did my best but obviously missed some things.

Yeah, definitely not your fault. People cannot expect maintainers to
review everything in detail.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-24 15:28                   ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-24 15:28 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Alexander Duyck, virtio-dev, qemu-devel

> So at the high level the idea was simple, we just clear the dirty bit
> when page is hinted, unless we sent a new command since. Implementation
> was reviewed by migration maintainers. If there's a consensus the code
> is written so badly we can't maintain it, maybe we should remove it.
> Which parts are unmaintainable in your eyes - migration or virtio ones?

QEMU implementation without a propert virtio specification. I hope that
we can *at least* finally document the expected behavior. Alex gave it a
shot, and I was hoping that Wei could jump in to clarify, help move this
forward ... after all he implemented (+designed?) the feature and the
virtio interface.

> Or maybe it's the general thing that interface was never specced
> properly.

Yes, a spec would be definitely a good starter ...

[...]

>>
>> 1. If migration fails during RAM precopy, the guest will never receive a
>> DONE notification. Probably easy to fix.
>>
>> 2. Unclear semantics. Alex tried to document what the actual semantics
>> of hinted pages are.
> 
> I'll reply to that now.
> 
>> Assume the following in the guest to a previously
>> hinted page
>>
>> /* page was hinted and is reused now */
>> if (page[x] != Y)
>> 	page[x] == Y;
>> /* migration ends, we now run on the destination */
>> BUG_ON(page[x] != Y);
>> /* BUG, because the content chan
> 
> The assumption hinting makes is that data in page is writtent to before it's used.
> 
> 
>> A guest can observe that. And that could be a random driver that just
>> allocated a page.
>>
>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>> sure, also, the actual semantics to document are unclear - e.g., for
>> other guests)
> 
> I think it's basically simple: hinting means it's ok to
> fill page with trash unless it has been modified since the command
> ID supplied.

Yeah, I quite dislike the semantics, especially, as they are different
to well-know semantics as e.g., represent in MADV_FREE. Getting changed
content when reading is really weird. But it seemed to be easier to
implement (low hanging fruit) and nobody complained back then. Well, now
we are stuck with it.

[..]

> 
>> There are other concerns I had regarding the iothread (e.g., while
>> reporting is active, virtio_ballloon_get_free_page_hints() is
>> essentially a busy loop, in contrast to documented -
>> continue_to_get_hints will always be true).
> 
> So that would be a performance issue you are suggesting, right?

I misread the code, so that comment does no longer apply (see other
message).

> 
>>> The appeal of hinting is that it's 0 overhead outside migration,
>>> and pains were taken to avoid keeping pages locked while
>>> hypervisor is busy.
>>>
>>> If we are to drop hinting completely we need to show that reporting
>>> can be comparable, and we'll probably want to add a mode for
>>> reporting that behaves somewhat similarly.
>>
>> Depends on the actual users. If we're dropping a feature that nobody is
>> actively using, I don't think we have to show anything.
> 
> 
> I don't know how to find out. So far it doesn't look like we found
> any common data corruptions that would indicate no one can use it safely.
> Races around reset aren't all that uncommon but I don't think that
> qualifies as a deal breaker.

As I said, there are no libvirt bindings, so at least anything using
libvirt does not use it. I'd be curious about actual users.

> 
> I find the idea of asynchronously sending hints to host without
> waiting for them to be processed intriguing. Not something
> I'd work on implementing if we had reporting originally,
> but since it's there I'm not sure we should just discard it
> at this point.
> 
>> This feature obviously saw no proper review.
> 
> I did my best but obviously missed some things.

Yeah, definitely not your fault. People cannot expect maintainers to
review everything in detail.

-- 
Thanks,

David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-24 15:28                   ` [virtio-dev] " David Hildenbrand
@ 2020-06-24 15:37                     ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-24 15:37 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-dev, qemu-devel, Alexander Duyck

On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
> > So at the high level the idea was simple, we just clear the dirty bit
> > when page is hinted, unless we sent a new command since. Implementation
> > was reviewed by migration maintainers. If there's a consensus the code
> > is written so badly we can't maintain it, maybe we should remove it.
> > Which parts are unmaintainable in your eyes - migration or virtio ones?
> 
> QEMU implementation without a propert virtio specification. I hope that
> we can *at least* finally document the expected behavior. Alex gave it a
> shot, and I was hoping that Wei could jump in to clarify, help move this
> forward ... after all he implemented (+designed?) the feature and the
> virtio interface.
> 
> > Or maybe it's the general thing that interface was never specced
> > properly.
> 
> Yes, a spec would be definitely a good starter ...
> 
> [...]
> 
> >>
> >> 1. If migration fails during RAM precopy, the guest will never receive a
> >> DONE notification. Probably easy to fix.
> >>
> >> 2. Unclear semantics. Alex tried to document what the actual semantics
> >> of hinted pages are.
> > 
> > I'll reply to that now.
> > 
> >> Assume the following in the guest to a previously
> >> hinted page
> >>
> >> /* page was hinted and is reused now */
> >> if (page[x] != Y)
> >> 	page[x] == Y;
> >> /* migration ends, we now run on the destination */
> >> BUG_ON(page[x] != Y);
> >> /* BUG, because the content chan
> > 
> > The assumption hinting makes is that data in page is writtent to before it's used.
> > 
> > 
> >> A guest can observe that. And that could be a random driver that just
> >> allocated a page.
> >>
> >> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> >> sure, also, the actual semantics to document are unclear - e.g., for
> >> other guests)
> > 
> > I think it's basically simple: hinting means it's ok to
> > fill page with trash unless it has been modified since the command
> > ID supplied.
> 
> Yeah, I quite dislike the semantics, especially, as they are different
> to well-know semantics as e.g., represent in MADV_FREE. Getting changed
> content when reading is really weird. But it seemed to be easier to
> implement (low hanging fruit) and nobody complained back then. Well, now
> we are stuck with it.
> 
> [..]

The difference with MADV_FREE is
- asynchronous (using cmd id to synchronize)
- zero not guaranteed

right?

> > 
> >> There are other concerns I had regarding the iothread (e.g., while
> >> reporting is active, virtio_ballloon_get_free_page_hints() is
> >> essentially a busy loop, in contrast to documented -
> >> continue_to_get_hints will always be true).
> > 
> > So that would be a performance issue you are suggesting, right?
> 
> I misread the code, so that comment does no longer apply (see other
> message).
> 
> > 
> >>> The appeal of hinting is that it's 0 overhead outside migration,
> >>> and pains were taken to avoid keeping pages locked while
> >>> hypervisor is busy.
> >>>
> >>> If we are to drop hinting completely we need to show that reporting
> >>> can be comparable, and we'll probably want to add a mode for
> >>> reporting that behaves somewhat similarly.
> >>
> >> Depends on the actual users. If we're dropping a feature that nobody is
> >> actively using, I don't think we have to show anything.
> > 
> > 
> > I don't know how to find out. So far it doesn't look like we found
> > any common data corruptions that would indicate no one can use it safely.
> > Races around reset aren't all that uncommon but I don't think that
> > qualifies as a deal breaker.
> 
> As I said, there are no libvirt bindings, so at least anything using
> libvirt does not use it. I'd be curious about actual users.
> 
> > 
> > I find the idea of asynchronously sending hints to host without
> > waiting for them to be processed intriguing. Not something
> > I'd work on implementing if we had reporting originally,
> > but since it's there I'm not sure we should just discard it
> > at this point.
> > 
> >> This feature obviously saw no proper review.
> > 
> > I did my best but obviously missed some things.
> 
> Yeah, definitely not your fault. People cannot expect maintainers to
> review everything in detail.
> 
> -- 
> Thanks,
> 
> David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-24 15:37                     ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-24 15:37 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Alexander Duyck, virtio-dev, qemu-devel

On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
> > So at the high level the idea was simple, we just clear the dirty bit
> > when page is hinted, unless we sent a new command since. Implementation
> > was reviewed by migration maintainers. If there's a consensus the code
> > is written so badly we can't maintain it, maybe we should remove it.
> > Which parts are unmaintainable in your eyes - migration or virtio ones?
> 
> QEMU implementation without a propert virtio specification. I hope that
> we can *at least* finally document the expected behavior. Alex gave it a
> shot, and I was hoping that Wei could jump in to clarify, help move this
> forward ... after all he implemented (+designed?) the feature and the
> virtio interface.
> 
> > Or maybe it's the general thing that interface was never specced
> > properly.
> 
> Yes, a spec would be definitely a good starter ...
> 
> [...]
> 
> >>
> >> 1. If migration fails during RAM precopy, the guest will never receive a
> >> DONE notification. Probably easy to fix.
> >>
> >> 2. Unclear semantics. Alex tried to document what the actual semantics
> >> of hinted pages are.
> > 
> > I'll reply to that now.
> > 
> >> Assume the following in the guest to a previously
> >> hinted page
> >>
> >> /* page was hinted and is reused now */
> >> if (page[x] != Y)
> >> 	page[x] == Y;
> >> /* migration ends, we now run on the destination */
> >> BUG_ON(page[x] != Y);
> >> /* BUG, because the content chan
> > 
> > The assumption hinting makes is that data in page is writtent to before it's used.
> > 
> > 
> >> A guest can observe that. And that could be a random driver that just
> >> allocated a page.
> >>
> >> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> >> sure, also, the actual semantics to document are unclear - e.g., for
> >> other guests)
> > 
> > I think it's basically simple: hinting means it's ok to
> > fill page with trash unless it has been modified since the command
> > ID supplied.
> 
> Yeah, I quite dislike the semantics, especially, as they are different
> to well-know semantics as e.g., represent in MADV_FREE. Getting changed
> content when reading is really weird. But it seemed to be easier to
> implement (low hanging fruit) and nobody complained back then. Well, now
> we are stuck with it.
> 
> [..]

The difference with MADV_FREE is
- asynchronous (using cmd id to synchronize)
- zero not guaranteed

right?

> > 
> >> There are other concerns I had regarding the iothread (e.g., while
> >> reporting is active, virtio_ballloon_get_free_page_hints() is
> >> essentially a busy loop, in contrast to documented -
> >> continue_to_get_hints will always be true).
> > 
> > So that would be a performance issue you are suggesting, right?
> 
> I misread the code, so that comment does no longer apply (see other
> message).
> 
> > 
> >>> The appeal of hinting is that it's 0 overhead outside migration,
> >>> and pains were taken to avoid keeping pages locked while
> >>> hypervisor is busy.
> >>>
> >>> If we are to drop hinting completely we need to show that reporting
> >>> can be comparable, and we'll probably want to add a mode for
> >>> reporting that behaves somewhat similarly.
> >>
> >> Depends on the actual users. If we're dropping a feature that nobody is
> >> actively using, I don't think we have to show anything.
> > 
> > 
> > I don't know how to find out. So far it doesn't look like we found
> > any common data corruptions that would indicate no one can use it safely.
> > Races around reset aren't all that uncommon but I don't think that
> > qualifies as a deal breaker.
> 
> As I said, there are no libvirt bindings, so at least anything using
> libvirt does not use it. I'd be curious about actual users.
> 
> > 
> > I find the idea of asynchronously sending hints to host without
> > waiting for them to be processed intriguing. Not something
> > I'd work on implementing if we had reporting originally,
> > but since it's there I'm not sure we should just discard it
> > at this point.
> > 
> >> This feature obviously saw no proper review.
> > 
> > I did my best but obviously missed some things.
> 
> Yeah, definitely not your fault. People cannot expect maintainers to
> review everything in detail.
> 
> -- 
> Thanks,
> 
> David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-24 15:37                     ` [virtio-dev] " Michael S. Tsirkin
@ 2020-06-24 16:01                       ` David Hildenbrand
  -1 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-24 16:01 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: virtio-dev, qemu-devel, Alexander Duyck

On 24.06.20 17:37, Michael S. Tsirkin wrote:
> On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
>>> So at the high level the idea was simple, we just clear the dirty bit
>>> when page is hinted, unless we sent a new command since. Implementation
>>> was reviewed by migration maintainers. If there's a consensus the code
>>> is written so badly we can't maintain it, maybe we should remove it.
>>> Which parts are unmaintainable in your eyes - migration or virtio ones?
>>
>> QEMU implementation without a propert virtio specification. I hope that
>> we can *at least* finally document the expected behavior. Alex gave it a
>> shot, and I was hoping that Wei could jump in to clarify, help move this
>> forward ... after all he implemented (+designed?) the feature and the
>> virtio interface.
>>
>>> Or maybe it's the general thing that interface was never specced
>>> properly.
>>
>> Yes, a spec would be definitely a good starter ...
>>
>> [...]
>>
>>>>
>>>> 1. If migration fails during RAM precopy, the guest will never receive a
>>>> DONE notification. Probably easy to fix.
>>>>
>>>> 2. Unclear semantics. Alex tried to document what the actual semantics
>>>> of hinted pages are.
>>>
>>> I'll reply to that now.
>>>
>>>> Assume the following in the guest to a previously
>>>> hinted page
>>>>
>>>> /* page was hinted and is reused now */
>>>> if (page[x] != Y)
>>>> 	page[x] == Y;
>>>> /* migration ends, we now run on the destination */
>>>> BUG_ON(page[x] != Y);
>>>> /* BUG, because the content chan
>>>
>>> The assumption hinting makes is that data in page is writtent to before it's used.
>>>
>>>
>>>> A guest can observe that. And that could be a random driver that just
>>>> allocated a page.
>>>>
>>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>>>> sure, also, the actual semantics to document are unclear - e.g., for
>>>> other guests)
>>>
>>> I think it's basically simple: hinting means it's ok to
>>> fill page with trash unless it has been modified since the command
>>> ID supplied.
>>
>> Yeah, I quite dislike the semantics, especially, as they are different
>> to well-know semantics as e.g., represent in MADV_FREE. Getting changed
>> content when reading is really weird. But it seemed to be easier to
>> implement (low hanging fruit) and nobody complained back then. Well, now
>> we are stuck with it.
>>
>> [..]
> 
> The difference with MADV_FREE is
> - asynchronous (using cmd id to synchronize)
> - zero not guaranteed
> 
> right?

*looking into man page*, yes, when reading you either get the old
content or zero.

(I remember that a re-read also makes the content stable, but looks like
you really have to write to a page)

We should most probably do what Alex suggested and initialize pages (at
least write a single byte) when leaking them from the shrinker in the
guest while hinting is active, such that the content is stable for
anybody to allocate and reuse a page.

-- 
Thanks,

David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-24 16:01                       ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-24 16:01 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: Alexander Duyck, virtio-dev, qemu-devel

On 24.06.20 17:37, Michael S. Tsirkin wrote:
> On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
>>> So at the high level the idea was simple, we just clear the dirty bit
>>> when page is hinted, unless we sent a new command since. Implementation
>>> was reviewed by migration maintainers. If there's a consensus the code
>>> is written so badly we can't maintain it, maybe we should remove it.
>>> Which parts are unmaintainable in your eyes - migration or virtio ones?
>>
>> QEMU implementation without a propert virtio specification. I hope that
>> we can *at least* finally document the expected behavior. Alex gave it a
>> shot, and I was hoping that Wei could jump in to clarify, help move this
>> forward ... after all he implemented (+designed?) the feature and the
>> virtio interface.
>>
>>> Or maybe it's the general thing that interface was never specced
>>> properly.
>>
>> Yes, a spec would be definitely a good starter ...
>>
>> [...]
>>
>>>>
>>>> 1. If migration fails during RAM precopy, the guest will never receive a
>>>> DONE notification. Probably easy to fix.
>>>>
>>>> 2. Unclear semantics. Alex tried to document what the actual semantics
>>>> of hinted pages are.
>>>
>>> I'll reply to that now.
>>>
>>>> Assume the following in the guest to a previously
>>>> hinted page
>>>>
>>>> /* page was hinted and is reused now */
>>>> if (page[x] != Y)
>>>> 	page[x] == Y;
>>>> /* migration ends, we now run on the destination */
>>>> BUG_ON(page[x] != Y);
>>>> /* BUG, because the content chan
>>>
>>> The assumption hinting makes is that data in page is writtent to before it's used.
>>>
>>>
>>>> A guest can observe that. And that could be a random driver that just
>>>> allocated a page.
>>>>
>>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>>>> sure, also, the actual semantics to document are unclear - e.g., for
>>>> other guests)
>>>
>>> I think it's basically simple: hinting means it's ok to
>>> fill page with trash unless it has been modified since the command
>>> ID supplied.
>>
>> Yeah, I quite dislike the semantics, especially, as they are different
>> to well-know semantics as e.g., represent in MADV_FREE. Getting changed
>> content when reading is really weird. But it seemed to be easier to
>> implement (low hanging fruit) and nobody complained back then. Well, now
>> we are stuck with it.
>>
>> [..]
> 
> The difference with MADV_FREE is
> - asynchronous (using cmd id to synchronize)
> - zero not guaranteed
> 
> right?

*looking into man page*, yes, when reading you either get the old
content or zero.

(I remember that a re-read also makes the content stable, but looks like
you really have to write to a page)

We should most probably do what Alex suggested and initialize pages (at
least write a single byte) when leaking them from the shrinker in the
guest while hinting is active, such that the content is stable for
anybody to allocate and reuse a page.

-- 
Thanks,

David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-24 16:01                       ` [virtio-dev] " David Hildenbrand
@ 2020-06-24 20:36                         ` Michael S. Tsirkin
  -1 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-24 20:36 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: virtio-dev, qemu-devel, Alexander Duyck

On Wed, Jun 24, 2020 at 06:01:02PM +0200, David Hildenbrand wrote:
> On 24.06.20 17:37, Michael S. Tsirkin wrote:
> > On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
> >>> So at the high level the idea was simple, we just clear the dirty bit
> >>> when page is hinted, unless we sent a new command since. Implementation
> >>> was reviewed by migration maintainers. If there's a consensus the code
> >>> is written so badly we can't maintain it, maybe we should remove it.
> >>> Which parts are unmaintainable in your eyes - migration or virtio ones?
> >>
> >> QEMU implementation without a propert virtio specification. I hope that
> >> we can *at least* finally document the expected behavior. Alex gave it a
> >> shot, and I was hoping that Wei could jump in to clarify, help move this
> >> forward ... after all he implemented (+designed?) the feature and the
> >> virtio interface.
> >>
> >>> Or maybe it's the general thing that interface was never specced
> >>> properly.
> >>
> >> Yes, a spec would be definitely a good starter ...
> >>
> >> [...]
> >>
> >>>>
> >>>> 1. If migration fails during RAM precopy, the guest will never receive a
> >>>> DONE notification. Probably easy to fix.
> >>>>
> >>>> 2. Unclear semantics. Alex tried to document what the actual semantics
> >>>> of hinted pages are.
> >>>
> >>> I'll reply to that now.
> >>>
> >>>> Assume the following in the guest to a previously
> >>>> hinted page
> >>>>
> >>>> /* page was hinted and is reused now */
> >>>> if (page[x] != Y)
> >>>> 	page[x] == Y;
> >>>> /* migration ends, we now run on the destination */
> >>>> BUG_ON(page[x] != Y);
> >>>> /* BUG, because the content chan
> >>>
> >>> The assumption hinting makes is that data in page is writtent to before it's used.
> >>>
> >>>
> >>>> A guest can observe that. And that could be a random driver that just
> >>>> allocated a page.
> >>>>
> >>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> >>>> sure, also, the actual semantics to document are unclear - e.g., for
> >>>> other guests)
> >>>
> >>> I think it's basically simple: hinting means it's ok to
> >>> fill page with trash unless it has been modified since the command
> >>> ID supplied.
> >>
> >> Yeah, I quite dislike the semantics, especially, as they are different
> >> to well-know semantics as e.g., represent in MADV_FREE. Getting changed
> >> content when reading is really weird. But it seemed to be easier to
> >> implement (low hanging fruit) and nobody complained back then. Well, now
> >> we are stuck with it.
> >>
> >> [..]
> > 
> > The difference with MADV_FREE is
> > - asynchronous (using cmd id to synchronize)
> > - zero not guaranteed
> > 
> > right?
> 
> *looking into man page*, yes, when reading you either get the old
> content or zero.
> 
> (I remember that a re-read also makes the content stable, but looks like
> you really have to write to a page)
> 
> We should most probably do what Alex suggested and initialize pages (at
> least write a single byte) when leaking them from the shrinker in the
> guest while hinting is active, such that the content is stable for
> anybody to allocate and reuse a page.

Drivers ignore old content from slab though, so I don't really see
the point.

> -- 
> Thanks,
> 
> David / dhildenb



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-24 20:36                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 44+ messages in thread
From: Michael S. Tsirkin @ 2020-06-24 20:36 UTC (permalink / raw)
  To: David Hildenbrand; +Cc: Alexander Duyck, virtio-dev, qemu-devel

On Wed, Jun 24, 2020 at 06:01:02PM +0200, David Hildenbrand wrote:
> On 24.06.20 17:37, Michael S. Tsirkin wrote:
> > On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
> >>> So at the high level the idea was simple, we just clear the dirty bit
> >>> when page is hinted, unless we sent a new command since. Implementation
> >>> was reviewed by migration maintainers. If there's a consensus the code
> >>> is written so badly we can't maintain it, maybe we should remove it.
> >>> Which parts are unmaintainable in your eyes - migration or virtio ones?
> >>
> >> QEMU implementation without a propert virtio specification. I hope that
> >> we can *at least* finally document the expected behavior. Alex gave it a
> >> shot, and I was hoping that Wei could jump in to clarify, help move this
> >> forward ... after all he implemented (+designed?) the feature and the
> >> virtio interface.
> >>
> >>> Or maybe it's the general thing that interface was never specced
> >>> properly.
> >>
> >> Yes, a spec would be definitely a good starter ...
> >>
> >> [...]
> >>
> >>>>
> >>>> 1. If migration fails during RAM precopy, the guest will never receive a
> >>>> DONE notification. Probably easy to fix.
> >>>>
> >>>> 2. Unclear semantics. Alex tried to document what the actual semantics
> >>>> of hinted pages are.
> >>>
> >>> I'll reply to that now.
> >>>
> >>>> Assume the following in the guest to a previously
> >>>> hinted page
> >>>>
> >>>> /* page was hinted and is reused now */
> >>>> if (page[x] != Y)
> >>>> 	page[x] == Y;
> >>>> /* migration ends, we now run on the destination */
> >>>> BUG_ON(page[x] != Y);
> >>>> /* BUG, because the content chan
> >>>
> >>> The assumption hinting makes is that data in page is writtent to before it's used.
> >>>
> >>>
> >>>> A guest can observe that. And that could be a random driver that just
> >>>> allocated a page.
> >>>>
> >>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
> >>>> sure, also, the actual semantics to document are unclear - e.g., for
> >>>> other guests)
> >>>
> >>> I think it's basically simple: hinting means it's ok to
> >>> fill page with trash unless it has been modified since the command
> >>> ID supplied.
> >>
> >> Yeah, I quite dislike the semantics, especially, as they are different
> >> to well-know semantics as e.g., represent in MADV_FREE. Getting changed
> >> content when reading is really weird. But it seemed to be easier to
> >> implement (low hanging fruit) and nobody complained back then. Well, now
> >> we are stuck with it.
> >>
> >> [..]
> > 
> > The difference with MADV_FREE is
> > - asynchronous (using cmd id to synchronize)
> > - zero not guaranteed
> > 
> > right?
> 
> *looking into man page*, yes, when reading you either get the old
> content or zero.
> 
> (I remember that a re-read also makes the content stable, but looks like
> you really have to write to a page)
> 
> We should most probably do what Alex suggested and initialize pages (at
> least write a single byte) when leaking them from the shrinker in the
> guest while hinting is active, such that the content is stable for
> anybody to allocate and reuse a page.

Drivers ignore old content from slab though, so I don't really see
the point.

> -- 
> Thanks,
> 
> David / dhildenb


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

* Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
  2020-06-24 20:36                         ` [virtio-dev] " Michael S. Tsirkin
@ 2020-06-24 21:06                           ` David Hildenbrand
  -1 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-24 21:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: virtio-dev, qemu-devel, Alexander Duyck, David Hildenbrand



> Am 24.06.2020 um 22:36 schrieb Michael S. Tsirkin <mst@redhat.com>:
> 
> On Wed, Jun 24, 2020 at 06:01:02PM +0200, David Hildenbrand wrote:
>>> On 24.06.20 17:37, Michael S. Tsirkin wrote:
>>> On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
>>>>> So at the high level the idea was simple, we just clear the dirty bit
>>>>> when page is hinted, unless we sent a new command since. Implementation
>>>>> was reviewed by migration maintainers. If there's a consensus the code
>>>>> is written so badly we can't maintain it, maybe we should remove it.
>>>>> Which parts are unmaintainable in your eyes - migration or virtio ones?
>>>> 
>>>> QEMU implementation without a propert virtio specification. I hope that
>>>> we can *at least* finally document the expected behavior. Alex gave it a
>>>> shot, and I was hoping that Wei could jump in to clarify, help move this
>>>> forward ... after all he implemented (+designed?) the feature and the
>>>> virtio interface.
>>>> 
>>>>> Or maybe it's the general thing that interface was never specced
>>>>> properly.
>>>> 
>>>> Yes, a spec would be definitely a good starter ...
>>>> 
>>>> [...]
>>>> 
>>>>>> 
>>>>>> 1. If migration fails during RAM precopy, the guest will never receive a
>>>>>> DONE notification. Probably easy to fix.
>>>>>> 
>>>>>> 2. Unclear semantics. Alex tried to document what the actual semantics
>>>>>> of hinted pages are.
>>>>> 
>>>>> I'll reply to that now.
>>>>> 
>>>>>> Assume the following in the guest to a previously
>>>>>> hinted page
>>>>>> 
>>>>>> /* page was hinted and is reused now */
>>>>>> if (page[x] != Y)
>>>>>>    page[x] == Y;
>>>>>> /* migration ends, we now run on the destination */
>>>>>> BUG_ON(page[x] != Y);
>>>>>> /* BUG, because the content chan
>>>>> 
>>>>> The assumption hinting makes is that data in page is writtent to before it's used.
>>>>> 
>>>>> 
>>>>>> A guest can observe that. And that could be a random driver that just
>>>>>> allocated a page.
>>>>>> 
>>>>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>>>>>> sure, also, the actual semantics to document are unclear - e.g., for
>>>>>> other guests)
>>>>> 
>>>>> I think it's basically simple: hinting means it's ok to
>>>>> fill page with trash unless it has been modified since the command
>>>>> ID supplied.
>>>> 
>>>> Yeah, I quite dislike the semantics, especially, as they are different
>>>> to well-know semantics as e.g., represent in MADV_FREE. Getting changed
>>>> content when reading is really weird. But it seemed to be easier to
>>>> implement (low hanging fruit) and nobody complained back then. Well, now
>>>> we are stuck with it.
>>>> 
>>>> [..]
>>> 
>>> The difference with MADV_FREE is
>>> - asynchronous (using cmd id to synchronize)
>>> - zero not guaranteed
>>> 
>>> right?
>> 
>> *looking into man page*, yes, when reading you either get the old
>> content or zero.
>> 
>> (I remember that a re-read also makes the content stable, but looks like
>> you really have to write to a page)
>> 
>> We should most probably do what Alex suggested and initialize pages (at
>> least write a single byte) when leaking them from the shrinker in the
>> guest while hinting is active, such that the content is stable for
>> anybody to allocate and reuse a page.
> 
> Drivers ignore old content from slab though, so I don't really see
> the point.
> 

That‘s what we‘re hoping for and what we would expect. Maybe we should just life with that assumption and hope for the best ...

>> -- 
>> Thanks,
>> 
>> David / dhildenb
> 



^ permalink raw reply	[flat|nested] 44+ messages in thread

* [virtio-dev] Re: [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint'
@ 2020-06-24 21:06                           ` David Hildenbrand
  0 siblings, 0 replies; 44+ messages in thread
From: David Hildenbrand @ 2020-06-24 21:06 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: David Hildenbrand, Alexander Duyck, virtio-dev, qemu-devel



> Am 24.06.2020 um 22:36 schrieb Michael S. Tsirkin <mst@redhat.com>:
> 
> On Wed, Jun 24, 2020 at 06:01:02PM +0200, David Hildenbrand wrote:
>>> On 24.06.20 17:37, Michael S. Tsirkin wrote:
>>> On Wed, Jun 24, 2020 at 05:28:59PM +0200, David Hildenbrand wrote:
>>>>> So at the high level the idea was simple, we just clear the dirty bit
>>>>> when page is hinted, unless we sent a new command since. Implementation
>>>>> was reviewed by migration maintainers. If there's a consensus the code
>>>>> is written so badly we can't maintain it, maybe we should remove it.
>>>>> Which parts are unmaintainable in your eyes - migration or virtio ones?
>>>> 
>>>> QEMU implementation without a propert virtio specification. I hope that
>>>> we can *at least* finally document the expected behavior. Alex gave it a
>>>> shot, and I was hoping that Wei could jump in to clarify, help move this
>>>> forward ... after all he implemented (+designed?) the feature and the
>>>> virtio interface.
>>>> 
>>>>> Or maybe it's the general thing that interface was never specced
>>>>> properly.
>>>> 
>>>> Yes, a spec would be definitely a good starter ...
>>>> 
>>>> [...]
>>>> 
>>>>>> 
>>>>>> 1. If migration fails during RAM precopy, the guest will never receive a
>>>>>> DONE notification. Probably easy to fix.
>>>>>> 
>>>>>> 2. Unclear semantics. Alex tried to document what the actual semantics
>>>>>> of hinted pages are.
>>>>> 
>>>>> I'll reply to that now.
>>>>> 
>>>>>> Assume the following in the guest to a previously
>>>>>> hinted page
>>>>>> 
>>>>>> /* page was hinted and is reused now */
>>>>>> if (page[x] != Y)
>>>>>>    page[x] == Y;
>>>>>> /* migration ends, we now run on the destination */
>>>>>> BUG_ON(page[x] != Y);
>>>>>> /* BUG, because the content chan
>>>>> 
>>>>> The assumption hinting makes is that data in page is writtent to before it's used.
>>>>> 
>>>>> 
>>>>>> A guest can observe that. And that could be a random driver that just
>>>>>> allocated a page.
>>>>>> 
>>>>>> (I *assume* in Linux we might catch that using kasan, but I am not 100%
>>>>>> sure, also, the actual semantics to document are unclear - e.g., for
>>>>>> other guests)
>>>>> 
>>>>> I think it's basically simple: hinting means it's ok to
>>>>> fill page with trash unless it has been modified since the command
>>>>> ID supplied.
>>>> 
>>>> Yeah, I quite dislike the semantics, especially, as they are different
>>>> to well-know semantics as e.g., represent in MADV_FREE. Getting changed
>>>> content when reading is really weird. But it seemed to be easier to
>>>> implement (low hanging fruit) and nobody complained back then. Well, now
>>>> we are stuck with it.
>>>> 
>>>> [..]
>>> 
>>> The difference with MADV_FREE is
>>> - asynchronous (using cmd id to synchronize)
>>> - zero not guaranteed
>>> 
>>> right?
>> 
>> *looking into man page*, yes, when reading you either get the old
>> content or zero.
>> 
>> (I remember that a re-read also makes the content stable, but looks like
>> you really have to write to a page)
>> 
>> We should most probably do what Alex suggested and initialize pages (at
>> least write a single byte) when leaking them from the shrinker in the
>> guest while hinting is active, such that the content is stable for
>> anybody to allocate and reuse a page.
> 
> Drivers ignore old content from slab though, so I don't really see
> the point.
> 

That‘s what we‘re hoping for and what we would expect. Maybe we should just life with that assumption and hope for the best ...

>> -- 
>> Thanks,
>> 
>> David / dhildenb
> 


---------------------------------------------------------------------
To unsubscribe, e-mail: virtio-dev-unsubscribe@lists.oasis-open.org
For additional commands, e-mail: virtio-dev-help@lists.oasis-open.org


^ permalink raw reply	[flat|nested] 44+ messages in thread

end of thread, other threads:[~2020-06-24 21:10 UTC | newest]

Thread overview: 44+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2020-05-27  4:13 [PATCH v25 QEMU 0/3] virtio-balloon: add support for page poison and free page reporting Alexander Duyck
2020-05-27  4:13 ` [virtio-dev] " Alexander Duyck
2020-05-27  4:14 ` [PATCH v25 QEMU 1/3] virtio-balloon: Implement support for page poison reporting feature Alexander Duyck
2020-05-27  4:14   ` [virtio-dev] " Alexander Duyck
2020-05-27  4:14 ` [PATCH v25 QEMU 2/3] virtio-balloon: Provide an interface for free page reporting Alexander Duyck
2020-05-27  4:14   ` [virtio-dev] " Alexander Duyck
2020-05-27  4:14 ` [PATCH v25 QEMU 3/3] virtio-balloon: Replace free page hinting references to 'report' with 'hint' Alexander Duyck
2020-05-27  4:14   ` [virtio-dev] " Alexander Duyck
2020-06-13 20:07   ` Alexander Duyck
2020-06-13 20:07     ` [virtio-dev] " Alexander Duyck
2020-06-18 12:54     ` David Hildenbrand
2020-06-18 12:54       ` [virtio-dev] " David Hildenbrand
2020-06-18 15:14       ` Alexander Duyck
2020-06-18 15:14         ` [virtio-dev] " Alexander Duyck
2020-06-18 15:58         ` David Hildenbrand
2020-06-18 15:58           ` [virtio-dev] " David Hildenbrand
2020-06-18 16:09           ` Michael S. Tsirkin
2020-06-18 16:09             ` [virtio-dev] " Michael S. Tsirkin
2020-06-18 17:10             ` David Hildenbrand
2020-06-18 17:10               ` [virtio-dev] " David Hildenbrand
2020-06-18 17:20               ` David Hildenbrand
2020-06-18 17:20                 ` [virtio-dev] " David Hildenbrand
2020-06-18 19:45               ` Alexander Duyck
2020-06-18 19:45                 ` [virtio-dev] " Alexander Duyck
2020-06-18 20:20                 ` David Hildenbrand
2020-06-18 20:20                   ` [virtio-dev] " David Hildenbrand
2020-06-24 14:56               ` Michael S. Tsirkin
2020-06-24 14:56                 ` [virtio-dev] " Michael S. Tsirkin
2020-06-24 15:28                 ` David Hildenbrand
2020-06-24 15:28                   ` [virtio-dev] " David Hildenbrand
2020-06-24 15:37                   ` Michael S. Tsirkin
2020-06-24 15:37                     ` [virtio-dev] " Michael S. Tsirkin
2020-06-24 16:01                     ` David Hildenbrand
2020-06-24 16:01                       ` [virtio-dev] " David Hildenbrand
2020-06-24 20:36                       ` Michael S. Tsirkin
2020-06-24 20:36                         ` [virtio-dev] " Michael S. Tsirkin
2020-06-24 21:06                         ` David Hildenbrand
2020-06-24 21:06                           ` [virtio-dev] " David Hildenbrand
2020-06-18 19:00     ` Dr. David Alan Gilbert
2020-06-18 19:00       ` Dr. David Alan Gilbert
2020-06-08 15:37 ` [PATCH v25 QEMU 0/3] virtio-balloon: add support for page poison and free page reporting Alexander Duyck
2020-06-08 15:37   ` [virtio-dev] " Alexander Duyck
2020-06-08 16:27   ` Michael S. Tsirkin
2020-06-08 16:27     ` [virtio-dev] " Michael S. Tsirkin

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.