qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
* [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
@ 2019-06-02 11:42 ` Jan Kiszka
  2019-06-02 12:10   ` Peter Xu
                     ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Jan Kiszka @ 2019-06-02 11:42 UTC (permalink / raw)
  To: qemu-devel; +Cc: Paolo Bonzini, Peter Xu, Michael S. Tsirkin

From: Jan Kiszka <jan.kiszka@siemens.com>

Masked entries will not generate interrupt messages, thus do no need to
be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
the kind

qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)

if the masked entry happens to reference a non-present IRTE.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
---
 hw/intc/ioapic.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index 7074489fdf..2fb288a22d 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -197,9 +197,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
             MSIMessage msg;
             struct ioapic_entry_info info;
             ioapic_entry_parse(s->ioredtbl[i], &info);
-            msg.address = info.addr;
-            msg.data = info.data;
-            kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
+            if (!info.masked) {
+                msg.address = info.addr;
+                msg.data = info.data;
+                kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
+            }
         }
         kvm_irqchip_commit_routes(kvm_state);
     }
-- 
2.16.4


^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
  2019-06-02 11:42 ` [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins Jan Kiszka
@ 2019-06-02 12:10   ` Peter Xu
  2019-06-03  6:30     ` Jan Kiszka
  2019-06-03  0:36   ` Michael S. Tsirkin
  2019-07-25 15:31   ` [Qemu-devel] [PULL 03/12] " Michael S. Tsirkin
  2 siblings, 1 reply; 53+ messages in thread
From: Peter Xu @ 2019-06-02 12:10 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Paolo Bonzini, qemu-devel, Michael S. Tsirkin

On Sun, Jun 02, 2019 at 01:42:13PM +0200, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> Masked entries will not generate interrupt messages, thus do no need to
> be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
> the kind
> 
> qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)
> 
> if the masked entry happens to reference a non-present IRTE.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Reviewed-by: Peter Xu <peterx@redhat.com>

Thanks, Jan.

The "non-cosmetic" part of clearing of those entries (e.g. including
when the entries were not setup correctly rather than masked) was
never really implemented and that task has been on my todo list for
quite a while but with a very low priority (low enough to sink...).  I
hope I didn't overlook its importance since AFAICT general OSs should
hardly trigger those paths and so far I don't see it a very big issue.

Regards,

-- 
Peter Xu


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
  2019-06-02 11:42 ` [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins Jan Kiszka
  2019-06-02 12:10   ` Peter Xu
@ 2019-06-03  0:36   ` Michael S. Tsirkin
  2019-07-21  8:58     ` Jan Kiszka
  2019-07-25 15:31   ` [Qemu-devel] [PULL 03/12] " Michael S. Tsirkin
  2 siblings, 1 reply; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-06-03  0:36 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Paolo Bonzini, qemu-devel, Peter Xu

On Sun, Jun 02, 2019 at 01:42:13PM +0200, Jan Kiszka wrote:
> From: Jan Kiszka <jan.kiszka@siemens.com>
> 
> Masked entries will not generate interrupt messages, thus do no need to
> be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
> the kind
> 
> qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)
> 
> if the masked entry happens to reference a non-present IRTE.
> 
> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>

> ---
>  hw/intc/ioapic.c | 8 +++++---
>  1 file changed, 5 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> index 7074489fdf..2fb288a22d 100644
> --- a/hw/intc/ioapic.c
> +++ b/hw/intc/ioapic.c
> @@ -197,9 +197,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
>              MSIMessage msg;
>              struct ioapic_entry_info info;
>              ioapic_entry_parse(s->ioredtbl[i], &info);
> -            msg.address = info.addr;
> -            msg.data = info.data;
> -            kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
> +            if (!info.masked) {
> +                msg.address = info.addr;
> +                msg.data = info.data;
> +                kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
> +            }
>          }
>          kvm_irqchip_commit_routes(kvm_state);
>      }
> -- 
> 2.16.4


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
  2019-06-02 12:10   ` Peter Xu
@ 2019-06-03  6:30     ` Jan Kiszka
  0 siblings, 0 replies; 53+ messages in thread
From: Jan Kiszka @ 2019-06-03  6:30 UTC (permalink / raw)
  To: Peter Xu; +Cc: Paolo Bonzini, qemu-devel, Michael S. Tsirkin

On 02.06.19 14:10, Peter Xu wrote:
> On Sun, Jun 02, 2019 at 01:42:13PM +0200, Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> Masked entries will not generate interrupt messages, thus do no need to
>> be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
>> the kind
>>
>> qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)
>>
>> if the masked entry happens to reference a non-present IRTE.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> 
> Reviewed-by: Peter Xu <peterx@redhat.com>
> 
> Thanks, Jan.
> 
> The "non-cosmetic" part of clearing of those entries (e.g. including
> when the entries were not setup correctly rather than masked) was
> never really implemented and that task has been on my todo list for
> quite a while but with a very low priority (low enough to sink...).  I
> hope I didn't overlook its importance since AFAICT general OSs should
> hardly trigger those paths and so far I don't see it a very big issue.

I triggered the message during the handover phase from Linux to Jailhouse. That 
involves reprogramming both IOAPIC and IRTEs - likely unusual sequences, I just 
didn't find invalid ones.

Thanks,
Jan

-- 
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH] docs: clarify multiqueue vs multiple virtqueues
@ 2019-06-24  9:13 ` Stefan Hajnoczi
  2019-06-24 10:19   ` Marc-André Lureau
                     ` (2 more replies)
  0 siblings, 3 replies; 53+ messages in thread
From: Stefan Hajnoczi @ 2019-06-24  9:13 UTC (permalink / raw)
  To: qemu-devel
  Cc: Marc-André Lureau, Sebastien Boeuf, Cathy Zhang,
	Stefan Hajnoczi, Michael S. Tsirkin

The vhost-user specification does not explain when
VHOST_USER_PROTOCOL_F_MQ must be implemented.  This may lead
implementors of vhost-user masters to believe that this protocol feature
is required for any device that has multiple virtqueues.  That would be
a mistake since existing vhost-user slaves offer multiple virtqueues but
do not advertise VHOST_USER_PROTOCOL_F_MQ.

For example, a vhost-net device with one rx/tx queue pair is not
multiqueue.  The slave does not need to advertise
VHOST_USER_PROTOCOL_F_MQ.  Therefore the master must assume it has these
virtqueues and cannot rely on askingt the slave how many virtqueues
exist.

Extend the specification to explain the different between true
multiqueue and regular devices with a fixed virtqueue layout.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
---
Based-on: <20190621094005.4134-1-stefanha@redhat.com>
---
 docs/interop/vhost-user.rst | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 5750668aba..7827b710aa 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -324,6 +324,15 @@ must support changing some configuration aspects on the fly.
 Multiple queue support
 ----------------------
 
+Many devices have a fixed number of virtqueues.  In this case the master
+already knows the number of available virtqueues without communicating with the
+slave.
+
+Some devices do not have a fixed number of virtqueues.  Instead the maximum
+number of virtqueues is chosen by the slave.  The number can depend on host
+resource availability or slave implementation details.  Such devices are called
+multiple queue devices.
+
 Multiple queue support allows the slave to advertise the maximum number of
 queues.  This is treated as a protocol extension, hence the slave has to
 implement protocol features first. The multiple queues feature is supported
@@ -339,6 +348,14 @@ queue in the sent message to identify a specified queue.
 The master enables queues by sending message ``VHOST_USER_SET_VRING_ENABLE``.
 vhost-user-net has historically automatically enabled the first queue pair.
 
+Slaves should always implement the ``VHOST_USER_PROTOCOL_F_MQ`` protocol
+feature, even for devices with a fixed number of virtqueues, since it is simple
+to implement and offers a degree of introspection.
+
+Masters must not rely on the ``VHOST_USER_PROTOCOL_F_MQ`` protocol feature for
+devices with a fixed number of virtqueues.  Only true multiqueue devices
+require this protocol feature.
+
 Migration
 ---------
 
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] docs: clarify multiqueue vs multiple virtqueues
  2019-06-24  9:13 ` [Qemu-devel] [PATCH] docs: clarify multiqueue vs multiple virtqueues Stefan Hajnoczi
@ 2019-06-24 10:19   ` Marc-André Lureau
  2019-07-17 10:14   ` Stefan Hajnoczi
  2019-07-25 15:31   ` [Qemu-devel] [PULL 01/12] " Michael S. Tsirkin
  2 siblings, 0 replies; 53+ messages in thread
From: Marc-André Lureau @ 2019-06-24 10:19 UTC (permalink / raw)
  To: Stefan Hajnoczi; +Cc: Sebastien Boeuf, Michael S. Tsirkin, QEMU, Cathy Zhang

Hi

On Mon, Jun 24, 2019 at 11:13 AM Stefan Hajnoczi <stefanha@redhat.com> wrote:
>
> The vhost-user specification does not explain when
> VHOST_USER_PROTOCOL_F_MQ must be implemented.  This may lead
> implementors of vhost-user masters to believe that this protocol feature
> is required for any device that has multiple virtqueues.  That would be
> a mistake since existing vhost-user slaves offer multiple virtqueues but
> do not advertise VHOST_USER_PROTOCOL_F_MQ.
>
> For example, a vhost-net device with one rx/tx queue pair is not
> multiqueue.  The slave does not need to advertise
> VHOST_USER_PROTOCOL_F_MQ.  Therefore the master must assume it has these
> virtqueues and cannot rely on askingt the slave how many virtqueues
> exist.
>
> Extend the specification to explain the different between true
> multiqueue and regular devices with a fixed virtqueue layout.
>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>

> ---
> Based-on: <20190621094005.4134-1-stefanha@redhat.com>
> ---
>  docs/interop/vhost-user.rst | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)
>
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 5750668aba..7827b710aa 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -324,6 +324,15 @@ must support changing some configuration aspects on the fly.
>  Multiple queue support
>  ----------------------
>
> +Many devices have a fixed number of virtqueues.  In this case the master
> +already knows the number of available virtqueues without communicating with the
> +slave.
> +
> +Some devices do not have a fixed number of virtqueues.  Instead the maximum
> +number of virtqueues is chosen by the slave.  The number can depend on host
> +resource availability or slave implementation details.  Such devices are called
> +multiple queue devices.
> +
>  Multiple queue support allows the slave to advertise the maximum number of
>  queues.  This is treated as a protocol extension, hence the slave has to
>  implement protocol features first. The multiple queues feature is supported
> @@ -339,6 +348,14 @@ queue in the sent message to identify a specified queue.
>  The master enables queues by sending message ``VHOST_USER_SET_VRING_ENABLE``.
>  vhost-user-net has historically automatically enabled the first queue pair.
>
> +Slaves should always implement the ``VHOST_USER_PROTOCOL_F_MQ`` protocol
> +feature, even for devices with a fixed number of virtqueues, since it is simple
> +to implement and offers a degree of introspection.
> +
> +Masters must not rely on the ``VHOST_USER_PROTOCOL_F_MQ`` protocol feature for
> +devices with a fixed number of virtqueues.  Only true multiqueue devices
> +require this protocol feature.
> +
>  Migration
>  ---------
>
> --
> 2.21.0
>
>


-- 
Marc-André Lureau


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] docs: clarify multiqueue vs multiple virtqueues
  2019-06-24  9:13 ` [Qemu-devel] [PATCH] docs: clarify multiqueue vs multiple virtqueues Stefan Hajnoczi
  2019-06-24 10:19   ` Marc-André Lureau
@ 2019-07-17 10:14   ` Stefan Hajnoczi
  2019-07-17 10:35     ` Michael S. Tsirkin
  2019-07-25 15:31   ` [Qemu-devel] [PULL 01/12] " Michael S. Tsirkin
  2 siblings, 1 reply; 53+ messages in thread
From: Stefan Hajnoczi @ 2019-07-17 10:14 UTC (permalink / raw)
  To: qemu-devel
  Cc: Marc-André Lureau, Sebastien Boeuf, Cathy Zhang, Michael S. Tsirkin

[-- Attachment #1: Type: text/plain, Size: 2856 bytes --]

On Mon, Jun 24, 2019 at 10:13:04AM +0100, Stefan Hajnoczi wrote:
> The vhost-user specification does not explain when
> VHOST_USER_PROTOCOL_F_MQ must be implemented.  This may lead
> implementors of vhost-user masters to believe that this protocol feature
> is required for any device that has multiple virtqueues.  That would be
> a mistake since existing vhost-user slaves offer multiple virtqueues but
> do not advertise VHOST_USER_PROTOCOL_F_MQ.
> 
> For example, a vhost-net device with one rx/tx queue pair is not
> multiqueue.  The slave does not need to advertise
> VHOST_USER_PROTOCOL_F_MQ.  Therefore the master must assume it has these
> virtqueues and cannot rely on askingt the slave how many virtqueues
> exist.
> 
> Extend the specification to explain the different between true
> multiqueue and regular devices with a fixed virtqueue layout.
> 
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
> Based-on: <20190621094005.4134-1-stefanha@redhat.com>
> ---
>  docs/interop/vhost-user.rst | 17 +++++++++++++++++
>  1 file changed, 17 insertions(+)

Ping?

> 
> diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> index 5750668aba..7827b710aa 100644
> --- a/docs/interop/vhost-user.rst
> +++ b/docs/interop/vhost-user.rst
> @@ -324,6 +324,15 @@ must support changing some configuration aspects on the fly.
>  Multiple queue support
>  ----------------------
>  
> +Many devices have a fixed number of virtqueues.  In this case the master
> +already knows the number of available virtqueues without communicating with the
> +slave.
> +
> +Some devices do not have a fixed number of virtqueues.  Instead the maximum
> +number of virtqueues is chosen by the slave.  The number can depend on host
> +resource availability or slave implementation details.  Such devices are called
> +multiple queue devices.
> +
>  Multiple queue support allows the slave to advertise the maximum number of
>  queues.  This is treated as a protocol extension, hence the slave has to
>  implement protocol features first. The multiple queues feature is supported
> @@ -339,6 +348,14 @@ queue in the sent message to identify a specified queue.
>  The master enables queues by sending message ``VHOST_USER_SET_VRING_ENABLE``.
>  vhost-user-net has historically automatically enabled the first queue pair.
>  
> +Slaves should always implement the ``VHOST_USER_PROTOCOL_F_MQ`` protocol
> +feature, even for devices with a fixed number of virtqueues, since it is simple
> +to implement and offers a degree of introspection.
> +
> +Masters must not rely on the ``VHOST_USER_PROTOCOL_F_MQ`` protocol feature for
> +devices with a fixed number of virtqueues.  Only true multiqueue devices
> +require this protocol feature.
> +
>  Migration
>  ---------
>  
> -- 
> 2.21.0
> 

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 488 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] docs: clarify multiqueue vs multiple virtqueues
  2019-07-17 10:14   ` Stefan Hajnoczi
@ 2019-07-17 10:35     ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-17 10:35 UTC (permalink / raw)
  To: Stefan Hajnoczi
  Cc: Marc-André Lureau, Sebastien Boeuf, qemu-devel, Cathy Zhang

On Wed, Jul 17, 2019 at 11:14:53AM +0100, Stefan Hajnoczi wrote:
> On Mon, Jun 24, 2019 at 10:13:04AM +0100, Stefan Hajnoczi wrote:
> > The vhost-user specification does not explain when
> > VHOST_USER_PROTOCOL_F_MQ must be implemented.  This may lead
> > implementors of vhost-user masters to believe that this protocol feature
> > is required for any device that has multiple virtqueues.  That would be
> > a mistake since existing vhost-user slaves offer multiple virtqueues but
> > do not advertise VHOST_USER_PROTOCOL_F_MQ.
> > 
> > For example, a vhost-net device with one rx/tx queue pair is not
> > multiqueue.  The slave does not need to advertise
> > VHOST_USER_PROTOCOL_F_MQ.  Therefore the master must assume it has these
> > virtqueues and cannot rely on askingt the slave how many virtqueues
> > exist.
> > 
> > Extend the specification to explain the different between true
> > multiqueue and regular devices with a fixed virtqueue layout.
> > 
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> > Based-on: <20190621094005.4134-1-stefanha@redhat.com>
> > ---
> >  docs/interop/vhost-user.rst | 17 +++++++++++++++++
> >  1 file changed, 17 insertions(+)
> 
> Ping?

queued, thanks!

> > 
> > diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
> > index 5750668aba..7827b710aa 100644
> > --- a/docs/interop/vhost-user.rst
> > +++ b/docs/interop/vhost-user.rst
> > @@ -324,6 +324,15 @@ must support changing some configuration aspects on the fly.
> >  Multiple queue support
> >  ----------------------
> >  
> > +Many devices have a fixed number of virtqueues.  In this case the master
> > +already knows the number of available virtqueues without communicating with the
> > +slave.
> > +
> > +Some devices do not have a fixed number of virtqueues.  Instead the maximum
> > +number of virtqueues is chosen by the slave.  The number can depend on host
> > +resource availability or slave implementation details.  Such devices are called
> > +multiple queue devices.
> > +
> >  Multiple queue support allows the slave to advertise the maximum number of
> >  queues.  This is treated as a protocol extension, hence the slave has to
> >  implement protocol features first. The multiple queues feature is supported
> > @@ -339,6 +348,14 @@ queue in the sent message to identify a specified queue.
> >  The master enables queues by sending message ``VHOST_USER_SET_VRING_ENABLE``.
> >  vhost-user-net has historically automatically enabled the first queue pair.
> >  
> > +Slaves should always implement the ``VHOST_USER_PROTOCOL_F_MQ`` protocol
> > +feature, even for devices with a fixed number of virtqueues, since it is simple
> > +to implement and offers a degree of introspection.
> > +
> > +Masters must not rely on the ``VHOST_USER_PROTOCOL_F_MQ`` protocol feature for
> > +devices with a fixed number of virtqueues.  Only true multiqueue devices
> > +require this protocol feature.
> > +
> >  Migration
> >  ---------
> >  
> > -- 
> > 2.21.0
> > 




^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH v2] i386/acpi: fix gint overflow in crs_range_compare
@ 2019-07-18 16:14 ` Evgeny Yakovlev
  2019-07-18 20:30   ` Michael S. Tsirkin
  2019-07-25 15:31   ` [Qemu-devel] [PULL 02/12] " Michael S. Tsirkin
  0 siblings, 2 replies; 53+ messages in thread
From: Evgeny Yakovlev @ 2019-07-18 16:14 UTC (permalink / raw)
  To: qemu-devel, mst; +Cc: yc-core

When very large regions (32GB sized in our case, PCI pass-through of GPUs)
are compared substraction result does not fit into gint.

As a result crs_replace_with_free_ranges does not get sorted ranges and
incorrectly computes PCI64 free space regions. Which then makes linux
guest complain about device and PCI64 hole intersection and device
becomes unusable.

Fix that by returning exactly fitting ranges.

Also fix indentation of an entire crs_replace_with_free_ranges to make
checkpatch happy.

Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
---
v2:
entire crs_replace_with_free_ranges was indented with 5 spaces, including my change.
fix that as well

 hw/i386/acpi-build.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index d281ffa..e7b756b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -755,10 +755,16 @@ static void crs_range_set_free(CrsRangeSet *range_set)
 
 static gint crs_range_compare(gconstpointer a, gconstpointer b)
 {
-     CrsRangeEntry *entry_a = *(CrsRangeEntry **)a;
-     CrsRangeEntry *entry_b = *(CrsRangeEntry **)b;
+    CrsRangeEntry *entry_a = *(CrsRangeEntry **)a;
+    CrsRangeEntry *entry_b = *(CrsRangeEntry **)b;
 
-     return (int64_t)entry_a->base - (int64_t)entry_b->base;
+    if (entry_a->base < entry_b->base) {
+        return -1;
+    } else if (entry_a->base > entry_b->base) {
+        return 1;
+    } else {
+        return 0;
+    }
 }
 
 /*
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH v2] i386/acpi: fix gint overflow in crs_range_compare
  2019-07-18 16:14 ` [Qemu-devel] [PATCH v2] i386/acpi: fix gint overflow in crs_range_compare Evgeny Yakovlev
@ 2019-07-18 20:30   ` Michael S. Tsirkin
  2019-07-25 15:31   ` [Qemu-devel] [PULL 02/12] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-18 20:30 UTC (permalink / raw)
  To: Evgeny Yakovlev; +Cc: qemu-devel, yc-core

On Thu, Jul 18, 2019 at 07:14:23PM +0300, Evgeny Yakovlev wrote:
> When very large regions (32GB sized in our case, PCI pass-through of GPUs)
> are compared substraction result does not fit into gint.
> 
> As a result crs_replace_with_free_ranges does not get sorted ranges and
> incorrectly computes PCI64 free space regions. Which then makes linux
> guest complain about device and PCI64 hole intersection and device
> becomes unusable.
> 
> Fix that by returning exactly fitting ranges.
> 
> Also fix indentation of an entire crs_replace_with_free_ranges to make
> checkpatch happy.
> 
> Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>

queued, thanks a lot!

> ---
> v2:
> entire crs_replace_with_free_ranges was indented with 5 spaces, including my change.
> fix that as well
> 
>  hw/i386/acpi-build.c | 12 +++++++++---
>  1 file changed, 9 insertions(+), 3 deletions(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index d281ffa..e7b756b 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -755,10 +755,16 @@ static void crs_range_set_free(CrsRangeSet *range_set)
>  
>  static gint crs_range_compare(gconstpointer a, gconstpointer b)
>  {
> -     CrsRangeEntry *entry_a = *(CrsRangeEntry **)a;
> -     CrsRangeEntry *entry_b = *(CrsRangeEntry **)b;
> +    CrsRangeEntry *entry_a = *(CrsRangeEntry **)a;
> +    CrsRangeEntry *entry_b = *(CrsRangeEntry **)b;
>  
> -     return (int64_t)entry_a->base - (int64_t)entry_b->base;
> +    if (entry_a->base < entry_b->base) {
> +        return -1;
> +    } else if (entry_a->base > entry_b->base) {
> +        return 1;
> +    } else {
> +        return 0;
> +    }
>  }
>  
>  /*
> -- 
> 2.7.4


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH] i386/acpi: show PCI Express bus on pxb-pcie expanders
@ 2019-07-19  8:54 ` Evgeny Yakovlev
  2019-07-19 12:14   ` Igor Mammedov
  2019-07-25 15:31   ` [Qemu-devel] [PULL 04/12] " Michael S. Tsirkin
  0 siblings, 2 replies; 53+ messages in thread
From: Evgeny Yakovlev @ 2019-07-19  8:54 UTC (permalink / raw)
  To: qemu-devel, mst; +Cc: yc-core

Show PCIe host bridge PNP id with PCI host bridge as a compatible id
when expanding a pcie bus.

Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
---
 hw/i386/acpi-build.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index d281ffa..0675952 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1908,10 +1908,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             scope = aml_scope("\\_SB");
             dev = aml_device("PC%.02X", bus_num);
             aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
-            aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
             aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
             if (pci_bus_is_express(bus)) {
+                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
+                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
                 aml_append(dev, build_q35_osc_method());
+            } else {
+                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
             }
 
             if (numa_node != NUMA_NODE_UNASSIGNED) {
-- 
2.7.4



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] i386/acpi: show PCI Express bus on pxb-pcie expanders
  2019-07-19  8:54 ` [Qemu-devel] [PATCH] i386/acpi: show PCI Express bus on pxb-pcie expanders Evgeny Yakovlev
@ 2019-07-19 12:14   ` Igor Mammedov
  2019-07-25 15:31   ` [Qemu-devel] [PULL 04/12] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Igor Mammedov @ 2019-07-19 12:14 UTC (permalink / raw)
  To: Evgeny Yakovlev; +Cc: qemu-devel, yc-core, mst

On Fri, 19 Jul 2019 11:54:29 +0300
Evgeny Yakovlev <wrfsh@yandex-team.ru> wrote:

> Show PCIe host bridge PNP id with PCI host bridge as a compatible id
> when expanding a pcie bus.
> 
> Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>

Reviewed-by: Igor Mammedov <imammedo@redhat.com>

> ---
>  hw/i386/acpi-build.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
> index d281ffa..0675952 100644
> --- a/hw/i386/acpi-build.c
> +++ b/hw/i386/acpi-build.c
> @@ -1908,10 +1908,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
>              scope = aml_scope("\\_SB");
>              dev = aml_device("PC%.02X", bus_num);
>              aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
> -            aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>              aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
>              if (pci_bus_is_express(bus)) {
> +                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
> +                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
>                  aml_append(dev, build_q35_osc_method());
> +            } else {
> +                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
>              }
>  
>              if (numa_node != NUMA_NODE_UNASSIGNED) {



^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
  2019-06-03  0:36   ` Michael S. Tsirkin
@ 2019-07-21  8:58     ` Jan Kiszka
  2019-07-21 10:04       ` Michael S. Tsirkin
  0 siblings, 1 reply; 53+ messages in thread
From: Jan Kiszka @ 2019-07-21  8:58 UTC (permalink / raw)
  To: Michael S. Tsirkin, Paolo Bonzini; +Cc: qemu-devel, Peter Xu

On 03.06.19 02:36, Michael S. Tsirkin wrote:
> On Sun, Jun 02, 2019 at 01:42:13PM +0200, Jan Kiszka wrote:
>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>
>> Masked entries will not generate interrupt messages, thus do no need to
>> be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
>> the kind
>>
>> qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)
>>
>> if the masked entry happens to reference a non-present IRTE.
>>
>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>
> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>
>> ---
>>  hw/intc/ioapic.c | 8 +++++---
>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>
>> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
>> index 7074489fdf..2fb288a22d 100644
>> --- a/hw/intc/ioapic.c
>> +++ b/hw/intc/ioapic.c
>> @@ -197,9 +197,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
>>              MSIMessage msg;
>>              struct ioapic_entry_info info;
>>              ioapic_entry_parse(s->ioredtbl[i], &info);
>> -            msg.address = info.addr;
>> -            msg.data = info.data;
>> -            kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
>> +            if (!info.masked) {
>> +                msg.address = info.addr;
>> +                msg.data = info.data;
>> +                kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
>> +            }
>>          }
>>          kvm_irqchip_commit_routes(kvm_state);
>>      }
>> --
>> 2.16.4
>
>

Ping. Or is this queued for 4.2?

Jan


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
  2019-07-21  8:58     ` Jan Kiszka
@ 2019-07-21 10:04       ` Michael S. Tsirkin
  2019-07-21 16:55         ` Paolo Bonzini
  0 siblings, 1 reply; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-21 10:04 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: Paolo Bonzini, qemu-devel, Peter Xu

On Sun, Jul 21, 2019 at 10:58:42AM +0200, Jan Kiszka wrote:
> On 03.06.19 02:36, Michael S. Tsirkin wrote:
> > On Sun, Jun 02, 2019 at 01:42:13PM +0200, Jan Kiszka wrote:
> >> From: Jan Kiszka <jan.kiszka@siemens.com>
> >>
> >> Masked entries will not generate interrupt messages, thus do no need to
> >> be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
> >> the kind
> >>
> >> qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)
> >>
> >> if the masked entry happens to reference a non-present IRTE.
> >>
> >> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >
> > Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
> >
> >> ---
> >>  hw/intc/ioapic.c | 8 +++++---
> >>  1 file changed, 5 insertions(+), 3 deletions(-)
> >>
> >> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
> >> index 7074489fdf..2fb288a22d 100644
> >> --- a/hw/intc/ioapic.c
> >> +++ b/hw/intc/ioapic.c
> >> @@ -197,9 +197,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
> >>              MSIMessage msg;
> >>              struct ioapic_entry_info info;
> >>              ioapic_entry_parse(s->ioredtbl[i], &info);
> >> -            msg.address = info.addr;
> >> -            msg.data = info.data;
> >> -            kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
> >> +            if (!info.masked) {
> >> +                msg.address = info.addr;
> >> +                msg.data = info.data;
> >> +                kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
> >> +            }
> >>          }
> >>          kvm_irqchip_commit_routes(kvm_state);
> >>      }
> >> --
> >> 2.16.4
> >
> >
> 
> Ping. Or is this queued for 4.2?
> 
> Jan

Paolo was queueing ioapic things recently. I can take this if
he doesn't respond until I pack up my next pull req.

-- 
MST


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins
  2019-07-21 10:04       ` Michael S. Tsirkin
@ 2019-07-21 16:55         ` Paolo Bonzini
  0 siblings, 0 replies; 53+ messages in thread
From: Paolo Bonzini @ 2019-07-21 16:55 UTC (permalink / raw)
  To: Michael S. Tsirkin, Jan Kiszka; +Cc: qemu-devel, Peter Xu

On 21/07/19 12:04, Michael S. Tsirkin wrote:
> On Sun, Jul 21, 2019 at 10:58:42AM +0200, Jan Kiszka wrote:
>> On 03.06.19 02:36, Michael S. Tsirkin wrote:
>>> On Sun, Jun 02, 2019 at 01:42:13PM +0200, Jan Kiszka wrote:
>>>> From: Jan Kiszka <jan.kiszka@siemens.com>
>>>>
>>>> Masked entries will not generate interrupt messages, thus do no need to
>>>> be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
>>>> the kind
>>>>
>>>> qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)
>>>>
>>>> if the masked entry happens to reference a non-present IRTE.
>>>>
>>>> Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>
>>> Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
>>>
>>>> ---
>>>>  hw/intc/ioapic.c | 8 +++++---
>>>>  1 file changed, 5 insertions(+), 3 deletions(-)
>>>>
>>>> diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
>>>> index 7074489fdf..2fb288a22d 100644
>>>> --- a/hw/intc/ioapic.c
>>>> +++ b/hw/intc/ioapic.c
>>>> @@ -197,9 +197,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
>>>>              MSIMessage msg;
>>>>              struct ioapic_entry_info info;
>>>>              ioapic_entry_parse(s->ioredtbl[i], &info);
>>>> -            msg.address = info.addr;
>>>> -            msg.data = info.data;
>>>> -            kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
>>>> +            if (!info.masked) {
>>>> +                msg.address = info.addr;
>>>> +                msg.data = info.data;
>>>> +                kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
>>>> +            }
>>>>          }
>>>>          kvm_irqchip_commit_routes(kvm_state);
>>>>      }
>>>> --
>>>> 2.16.4
>>>
>>>
>>
>> Ping. Or is this queued for 4.2?
>>
>> Jan
> 
> Paolo was queueing ioapic things recently. I can take this if
> he doesn't respond until I pack up my next pull req.
> 

I've already sent a pull request and had missed this patch, so please go
ahead.

Paolo


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v3 0/6] virtio-balloon: fixes
@ 2019-07-22 13:41 David Hildenbrand
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 1/6] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
                   ` (5 more replies)
  0 siblings, 6 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-22 13:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

As we discovered yet another issue with current PBP code, we decided
to use a local PBP only, limited to one series of inflation requests.
This series supersedes:
- [PATCH-for-4.1 v2 0/3] virtio-balloon: fixes for PartialBalloonedPage
- [PATCH v1 0/3] virtio-balloon: PartialBalloonedPage rework

Patch #1 is a fix for a wrong sign extension (MST brought this up but
wasn't sure if it is broen - I think it is indeed broken). Patch #2 fixed
QEMU segfaults. Patch #3 and #4 are cleanups that make follow-up fixes
easier. Patch #5 avoids using RAMBlock addresses as tokens and patch #6
fixes all kinds of issues related to using a global PBP.

We want to have all patches in 4.1 and backport them to 4.0. Patch #1
needs backports to basically all QEMU releases with virtio-balloon.

Did a quick sanity test, hopefully no other BUG sneeked in. Will do some
more testing.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-stable@nongnu.org

David Hildenbrand (6):
  virtio-balloon: Fix wrong sign extension of PFNs
  virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
  virtio-balloon: Simplify deflate with pbp
  virtio-balloon: Better names for offset variables in inflate/deflate
    code
  virtio-balloon: Rework pbp tracking data
  virtio-balloon: Use temporary PBP only

 hw/virtio/virtio-balloon.c         | 122 +++++++++++++++--------------
 include/hw/virtio/virtio-balloon.h |   3 -
 2 files changed, 62 insertions(+), 63 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v3 1/6] virtio-balloon: Fix wrong sign extension of PFNs
  2019-07-22 13:41 [Qemu-devel] [PATCH-for-4.1 v3 0/6] virtio-balloon: fixes David Hildenbrand
@ 2019-07-22 13:41 ` David Hildenbrand
  2019-07-23  2:27   ` David Gibson
  2019-07-25 15:31   ` [Qemu-devel] [PULL 05/12] " Michael S. Tsirkin
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 2/6] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-22 13:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

If we directly cast from int to uint64_t, we will first sign-extend to
an int64_t, which is wrong. We actually want to treat the PFNs like
unsigned values.

As far as I can see, this dates back to the initial virtio-balloon
commit, but wasn't triggered as fairly big guests would be required.

Cc: qemu-stable@nongnu.org
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index e85d1c0d5c..515abf6553 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -343,8 +343,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         }
 
         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
+            unsigned int p = virtio_ldl_p(vdev, &pfn);
             hwaddr pa;
-            int p = virtio_ldl_p(vdev, &pfn);
 
             pa = (hwaddr) p << VIRTIO_BALLOON_PFN_SHIFT;
             offset += 4;
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v3 2/6] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
  2019-07-22 13:41 [Qemu-devel] [PATCH-for-4.1 v3 0/6] virtio-balloon: fixes David Hildenbrand
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 1/6] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
@ 2019-07-22 13:41 ` David Hildenbrand
  2019-07-25 15:31   ` [Qemu-devel] [PULL 06/12] " Michael S. Tsirkin
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 3/6] virtio-balloon: Simplify deflate with pbp David Hildenbrand
                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 53+ messages in thread
From: David Hildenbrand @ 2019-07-22 13:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

We are using the wrong functions to set/clear bits, effectively touching
multiple bits, writing out of range of the bitmap, resulting in memory
corruptions. We have to use set_bit()/clear_bit() instead.

Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
inflating the balloon. QEMU crashes. This never could have worked
properly - especially, also pages would have been discarded when the
first sub-page would be inflated (the whole bitmap would be set).

While testing I realized, that on hugetlbfs it is pretty much impossible
to discard a page - the guest just frees the 4k sub-pages in random order
most of the time. I was only able to discard a hugepage a handful of
times - so I hope that now works correctly.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
                     host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
                     with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 515abf6553..a78d2d2184 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
         balloon->pbp->base = host_page_base;
     }
 
-    bitmap_set(balloon->pbp->bitmap,
-               (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-               subpages);
+    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+            balloon->pbp->bitmap);
 
     if (bitmap_full(balloon->pbp->bitmap, subpages)) {
         /* We've accumulated a full host page, we can actually discard
@@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
          * for a guest to do this in practice, but handle it anyway,
          * since getting it wrong could mean discarding memory the
          * guest is still using. */
-        bitmap_clear(balloon->pbp->bitmap,
-                     (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-                     subpages);
+        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+                  balloon->pbp->bitmap);
 
         if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
             g_free(balloon->pbp);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v3 3/6] virtio-balloon: Simplify deflate with pbp
  2019-07-22 13:41 [Qemu-devel] [PATCH-for-4.1 v3 0/6] virtio-balloon: fixes David Hildenbrand
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 1/6] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 2/6] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
@ 2019-07-22 13:41 ` David Hildenbrand
  2019-07-25 15:32   ` [Qemu-devel] [PULL 07/12] " Michael S. Tsirkin
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 4/6] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 53+ messages in thread
From: David Hildenbrand @ 2019-07-22 13:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

Let's simplify this - the case we are optimizing for is very hard to
trigger and not worth the effort. If we're switching from inflation to
deflation, let's reset the pbp.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 26 +++++---------------------
 1 file changed, 5 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a78d2d2184..04a7e6c772 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -117,7 +117,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
     void *addr = memory_region_get_ram_ptr(mr) + offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset, host_page_base;
+    ram_addr_t ram_offset;
     void *host_addr;
     int ret;
 
@@ -125,27 +125,11 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
      * host address? */
     rb = qemu_ram_block_from_host(addr, false, &ram_offset);
     rb_page_size = qemu_ram_pagesize(rb);
-    host_page_base = ram_offset & ~(rb_page_size - 1);
-
-    if (balloon->pbp
-        && rb == balloon->pbp->rb
-        && host_page_base == balloon->pbp->base) {
-        int subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
-        /*
-         * This means the guest has asked to discard some of the 4kiB
-         * subpages of a host page, but then changed its mind and
-         * asked to keep them after all.  It's exceedingly unlikely
-         * for a guest to do this in practice, but handle it anyway,
-         * since getting it wrong could mean discarding memory the
-         * guest is still using. */
-        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-                  balloon->pbp->bitmap);
-
-        if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
-            g_free(balloon->pbp);
-            balloon->pbp = NULL;
-        }
+    if (balloon->pbp) {
+        /* Let's play safe and always reset the pbp on deflation requests. */
+        g_free(balloon->pbp);
+        balloon->pbp = NULL;
     }
 
     host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v3 4/6] virtio-balloon: Better names for offset variables in inflate/deflate code
  2019-07-22 13:41 [Qemu-devel] [PATCH-for-4.1 v3 0/6] virtio-balloon: fixes David Hildenbrand
                   ` (2 preceding siblings ...)
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 3/6] virtio-balloon: Simplify deflate with pbp David Hildenbrand
@ 2019-07-22 13:41 ` David Hildenbrand
  2019-07-25 15:32   ` [Qemu-devel] [PULL 08/12] " Michael S. Tsirkin
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 5/6] virtio-balloon: Rework pbp tracking data David Hildenbrand
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 6/6] virtio-balloon: Use temporary PBP only David Hildenbrand
  5 siblings, 1 reply; 53+ messages in thread
From: David Hildenbrand @ 2019-07-22 13:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

"host_page_base" is really confusing, let's make this clearer, also
rename the other offsets to indicate to which base they apply.

offset -> mr_offset
ram_offset -> rb_offset
host_page_base -> rb_aligned_offset

While at it, use QEMU_ALIGN_DOWN() instead of a handcrafted computation
and move the computation to the place where it is needed.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 04a7e6c772..f206cc8bf7 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -41,24 +41,23 @@ struct PartiallyBalloonedPage {
 };
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr offset)
+                                 MemoryRegion *mr, hwaddr mr_offset)
 {
-    void *addr = memory_region_get_ram_ptr(mr) + offset;
+    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+    ram_addr_t rb_offset, rb_aligned_offset;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
-    ram_addr_t ram_offset, host_page_base;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
-    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
-    host_page_base = ram_offset & ~(rb_page_size - 1);
 
     if (rb_page_size == BALLOON_PAGE_SIZE) {
         /* Easy case */
 
-        ram_block_discard_range(rb, ram_offset, rb_page_size);
+        ram_block_discard_range(rb, rb_offset, rb_page_size);
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
@@ -74,11 +73,12 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
     warn_report_once(
 "Balloon used with backing page size > 4kiB, this may not be reliable");
 
+    rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
     subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
     if (balloon->pbp
         && (rb != balloon->pbp->rb
-            || host_page_base != balloon->pbp->base)) {
+            || rb_aligned_offset != balloon->pbp->base)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
@@ -91,10 +91,10 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
         size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
         balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
         balloon->pbp->rb = rb;
-        balloon->pbp->base = host_page_base;
+        balloon->pbp->base = rb_aligned_offset;
     }
 
-    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
             balloon->pbp->bitmap);
 
     if (bitmap_full(balloon->pbp->bitmap, subpages)) {
@@ -112,18 +112,18 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 }
 
 static void balloon_deflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr offset)
+                                 MemoryRegion *mr, hwaddr mr_offset)
 {
-    void *addr = memory_region_get_ram_ptr(mr) + offset;
+    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+    ram_addr_t rb_offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset;
     void *host_addr;
     int ret;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
-    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
 
     if (balloon->pbp) {
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v3 5/6] virtio-balloon: Rework pbp tracking data
  2019-07-22 13:41 [Qemu-devel] [PATCH-for-4.1 v3 0/6] virtio-balloon: fixes David Hildenbrand
                   ` (3 preceding siblings ...)
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 4/6] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
@ 2019-07-22 13:41 ` David Hildenbrand
  2019-07-23  2:54   ` David Gibson
  2019-07-25 15:32   ` [Qemu-devel] [PULL 09/12] " Michael S. Tsirkin
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 6/6] virtio-balloon: Use temporary PBP only David Hildenbrand
  5 siblings, 2 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-22 13:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

Using the address of a RAMBlock to test for a matching pbp is not really
safe. Instead, let's use the guest physical address of the base page
along with the page size (via the number of subpages).

Also, let's allocate the bitmap separately. This makes the code
easier to read and maintain - we can reuse bitmap_new().

Prepare the code to move the PBP out of the device.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
                     host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
                     with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 69 +++++++++++++++++++++++++-------------
 1 file changed, 46 insertions(+), 23 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index f206cc8bf7..40d493a31a 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -35,16 +35,44 @@
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
 struct PartiallyBalloonedPage {
-    RAMBlock *rb;
-    ram_addr_t base;
-    unsigned long bitmap[];
+    ram_addr_t base_gpa;
+    long subpages;
+    unsigned long *bitmap;
 };
 
+static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
+{
+    if (!pbp) {
+        return;
+    }
+    g_free(pbp->bitmap);
+    g_free(pbp);
+}
+
+static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
+                                                        long subpages)
+{
+    PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
+
+    pbp->base_gpa = base_gpa;
+    pbp->subpages = subpages;
+    pbp->bitmap = bitmap_new(subpages);
+
+    return pbp;
+}
+
+static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
+                                       ram_addr_t base_gpa, long subpages)
+{
+    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;
+}
+
 static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr mr_offset)
 {
     void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
-    ram_addr_t rb_offset, rb_aligned_offset;
+    ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
+    PartiallyBalloonedPage **pbp = &balloon->pbp;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
@@ -75,39 +103,34 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 
     rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
     subpages = rb_page_size / BALLOON_PAGE_SIZE;
+    base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
+               (rb_offset - rb_aligned_offset);
 
-    if (balloon->pbp
-        && (rb != balloon->pbp->rb
-            || rb_aligned_offset != balloon->pbp->base)) {
+    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa, subpages)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
-        g_free(balloon->pbp);
-        balloon->pbp = NULL;
+        virtio_balloon_pbp_free(*pbp);
+        *pbp = NULL;
     }
 
-    if (!balloon->pbp) {
-        /* Starting on a new host page */
-        size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
-        balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
-        balloon->pbp->rb = rb;
-        balloon->pbp->base = rb_aligned_offset;
+    if (!*pbp) {
+        *pbp = virtio_balloon_pbp_alloc(base_gpa, subpages);
     }
 
-    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-            balloon->pbp->bitmap);
+    set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
+            (*pbp)->bitmap);
 
-    if (bitmap_full(balloon->pbp->bitmap, subpages)) {
+    if (bitmap_full((*pbp)->bitmap, subpages)) {
         /* We've accumulated a full host page, we can actually discard
          * it now */
 
-        ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
+        ram_block_discard_range(rb, rb_aligned_offset, rb_page_size);
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
-
-        g_free(balloon->pbp);
-        balloon->pbp = NULL;
+        virtio_balloon_pbp_free(*pbp);
+        *pbp = NULL;
     }
 }
 
@@ -128,7 +151,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
 
     if (balloon->pbp) {
         /* Let's play safe and always reset the pbp on deflation requests. */
-        g_free(balloon->pbp);
+        virtio_balloon_pbp_free(balloon->pbp);
         balloon->pbp = NULL;
     }
 
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v3 6/6] virtio-balloon: Use temporary PBP only
  2019-07-22 13:41 [Qemu-devel] [PATCH-for-4.1 v3 0/6] virtio-balloon: fixes David Hildenbrand
                   ` (4 preceding siblings ...)
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 5/6] virtio-balloon: Rework pbp tracking data David Hildenbrand
@ 2019-07-22 13:41 ` David Hildenbrand
  2019-07-23  3:22   ` David Gibson
  2019-07-25 15:32   ` [Qemu-devel] [PULL 10/12] " Michael S. Tsirkin
  5 siblings, 2 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-22 13:41 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

We still have multiple issues in the current code
- The PBP is not freed during unrealize()
- The PBP is not reset on device resets: After a reset, the PBP is stale.
- We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
  guests (esp. legacy guests) will reuse pages without deflating,
  turning the PBP stale. Adding that would require compat handling.

Instead, let's use the PBP only temporarily, when processing one bulk of
inflation requests. This will keep guest_page_size > 4k working (with
Linux guests). There is nothing to do for deflation requests anymore.
The pbp is only used for a limited amount of time.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
                     host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c         | 21 +++++++++------------
 include/hw/virtio/virtio-balloon.h |  3 ---
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 40d493a31a..a6282d58d4 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -34,11 +34,11 @@
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
-struct PartiallyBalloonedPage {
+typedef struct PartiallyBalloonedPage {
     ram_addr_t base_gpa;
     long subpages;
     unsigned long *bitmap;
-};
+} PartiallyBalloonedPage;
 
 static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
 {
@@ -68,11 +68,11 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr mr_offset)
+                                 MemoryRegion *mr, hwaddr mr_offset,
+                                 PartiallyBalloonedPage **pbp)
 {
     void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
     ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
-    PartiallyBalloonedPage **pbp = &balloon->pbp;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
@@ -149,12 +149,6 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
     rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
 
-    if (balloon->pbp) {
-        /* Let's play safe and always reset the pbp on deflation requests. */
-        virtio_balloon_pbp_free(balloon->pbp);
-        balloon->pbp = NULL;
-    }
-
     host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
 
     /* When a page is deflated, we hint the whole host page it lives
@@ -336,6 +330,7 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
 static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
+    PartiallyBalloonedPage *pbp = NULL;
     VirtQueueElement *elem;
     MemoryRegionSection section;
 
@@ -344,7 +339,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         uint32_t pfn;
         elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
         if (!elem) {
-            return;
+            break;
         }
 
         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
@@ -373,7 +368,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
             if (!qemu_balloon_is_inhibited()) {
                 if (vq == s->ivq) {
                     balloon_inflate_page(s, section.mr,
-                                         section.offset_within_region);
+                                         section.offset_within_region, &pbp);
                 } else if (vq == s->dvq) {
                     balloon_deflate_page(s, section.mr, section.offset_within_region);
                 } else {
@@ -387,6 +382,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         virtio_notify(vdev, vq);
         g_free(elem);
     }
+
+    virtio_balloon_pbp_free(pbp);
 }
 
 static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 5a99293a45..d1c968d237 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -33,8 +33,6 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
-typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
-
 enum virtio_balloon_free_page_report_status {
     FREE_PAGE_REPORT_S_STOP = 0,
     FREE_PAGE_REPORT_S_REQUESTED = 1,
@@ -70,7 +68,6 @@ typedef struct VirtIOBalloon {
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;
-    PartiallyBalloonedPage *pbp;
 
     bool qemu_4_0_config_size;
 } VirtIOBalloon;
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v3 1/6] virtio-balloon: Fix wrong sign extension of PFNs
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 1/6] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
@ 2019-07-23  2:27   ` David Gibson
  2019-07-25 15:31   ` [Qemu-devel] [PULL 05/12] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: David Gibson @ 2019-07-23  2:27 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-stable, qemu-devel, Stefan Hajnoczi,
	Michael S . Tsirkin

[-- Attachment #1: Type: text/plain, Size: 1492 bytes --]

On Mon, Jul 22, 2019 at 03:41:03PM +0200, David Hildenbrand wrote:
> If we directly cast from int to uint64_t, we will first sign-extend to
> an int64_t, which is wrong. We actually want to treat the PFNs like
> unsigned values.
> 
> As far as I can see, this dates back to the initial virtio-balloon
> commit, but wasn't triggered as fairly big guests would be required.
> 
> Cc: qemu-stable@nongnu.org
> Reported-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/virtio/virtio-balloon.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index e85d1c0d5c..515abf6553 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -343,8 +343,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>          }
>  
>          while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
> +            unsigned int p = virtio_ldl_p(vdev, &pfn);
>              hwaddr pa;
> -            int p = virtio_ldl_p(vdev, &pfn);
>  
>              pa = (hwaddr) p << VIRTIO_BALLOON_PFN_SHIFT;
>              offset += 4;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v3 5/6] virtio-balloon: Rework pbp tracking data
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 5/6] virtio-balloon: Rework pbp tracking data David Hildenbrand
@ 2019-07-23  2:54   ` David Gibson
  2019-07-23  7:38     ` David Hildenbrand
  2019-07-25 15:32   ` [Qemu-devel] [PULL 09/12] " Michael S. Tsirkin
  1 sibling, 1 reply; 53+ messages in thread
From: David Gibson @ 2019-07-23  2:54 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-stable, qemu-devel, Stefan Hajnoczi,
	Michael S . Tsirkin

[-- Attachment #1: Type: text/plain, Size: 5675 bytes --]

On Mon, Jul 22, 2019 at 03:41:07PM +0200, David Hildenbrand wrote:
> Using the address of a RAMBlock to test for a matching pbp is not really
> safe. Instead, let's use the guest physical address of the base page
> along with the page size (via the number of subpages).
> 
> Also, let's allocate the bitmap separately. This makes the code
> easier to read and maintain - we can reuse bitmap_new().
> 
> Prepare the code to move the PBP out of the device.
> 
> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>                      host page size")
> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
>                      with inflates & deflates")
> Cc: qemu-stable@nongnu.org #v4.0.0
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/virtio/virtio-balloon.c | 69 +++++++++++++++++++++++++-------------
>  1 file changed, 46 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index f206cc8bf7..40d493a31a 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -35,16 +35,44 @@
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
>  struct PartiallyBalloonedPage {
> -    RAMBlock *rb;
> -    ram_addr_t base;
> -    unsigned long bitmap[];
> +    ram_addr_t base_gpa;
> +    long subpages;
> +    unsigned long *bitmap;
>  };
>  
> +static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
> +{
> +    if (!pbp) {
> +        return;
> +    }
> +    g_free(pbp->bitmap);
> +    g_free(pbp);
> +}
> +
> +static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
> +                                                        long subpages)
> +{
> +    PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
> +
> +    pbp->base_gpa = base_gpa;
> +    pbp->subpages = subpages;
> +    pbp->bitmap = bitmap_new(subpages);
> +
> +    return pbp;
> +}
> +
> +static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
> +                                       ram_addr_t base_gpa, long subpages)
> +{
> +    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;

I think the test on subpages is pointless here.  You've (reasonably)
rejected handling the edge case of a ramblock being removed and
replugged - but the only thing this handles is an edge case of that
edge case.


Otherwise, LGTM.

> +}
> +
>  static void balloon_inflate_page(VirtIOBalloon *balloon,
>                                   MemoryRegion *mr, hwaddr mr_offset)
>  {
>      void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
> -    ram_addr_t rb_offset, rb_aligned_offset;
> +    ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
> +    PartiallyBalloonedPage **pbp = &balloon->pbp;
>      RAMBlock *rb;
>      size_t rb_page_size;
>      int subpages;
> @@ -75,39 +103,34 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>  
>      rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
>      subpages = rb_page_size / BALLOON_PAGE_SIZE;
> +    base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
> +               (rb_offset - rb_aligned_offset);
>  
> -    if (balloon->pbp
> -        && (rb != balloon->pbp->rb
> -            || rb_aligned_offset != balloon->pbp->base)) {
> +    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa, subpages)) {
>          /* We've partially ballooned part of a host page, but now
>           * we're trying to balloon part of a different one.  Too hard,
>           * give up on the old partial page */
> -        g_free(balloon->pbp);
> -        balloon->pbp = NULL;
> +        virtio_balloon_pbp_free(*pbp);
> +        *pbp = NULL;
>      }
>  
> -    if (!balloon->pbp) {
> -        /* Starting on a new host page */
> -        size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
> -        balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
> -        balloon->pbp->rb = rb;
> -        balloon->pbp->base = rb_aligned_offset;
> +    if (!*pbp) {
> +        *pbp = virtio_balloon_pbp_alloc(base_gpa, subpages);
>      }
>  
> -    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> -            balloon->pbp->bitmap);
> +    set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
> +            (*pbp)->bitmap);
>  
> -    if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> +    if (bitmap_full((*pbp)->bitmap, subpages)) {
>          /* We've accumulated a full host page, we can actually discard
>           * it now */
>  
> -        ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
> +        ram_block_discard_range(rb, rb_aligned_offset, rb_page_size);
>          /* We ignore errors from ram_block_discard_range(), because it
>           * has already reported them, and failing to discard a balloon
>           * page is not fatal */
> -
> -        g_free(balloon->pbp);
> -        balloon->pbp = NULL;
> +        virtio_balloon_pbp_free(*pbp);
> +        *pbp = NULL;
>      }
>  }
>  
> @@ -128,7 +151,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>  
>      if (balloon->pbp) {
>          /* Let's play safe and always reset the pbp on deflation requests. */
> -        g_free(balloon->pbp);
> +        virtio_balloon_pbp_free(balloon->pbp);
>          balloon->pbp = NULL;
>      }
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v3 6/6] virtio-balloon: Use temporary PBP only
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 6/6] virtio-balloon: Use temporary PBP only David Hildenbrand
@ 2019-07-23  3:22   ` David Gibson
  2019-07-25 15:32   ` [Qemu-devel] [PULL 10/12] " Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: David Gibson @ 2019-07-23  3:22 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-stable, qemu-devel, Stefan Hajnoczi,
	Michael S . Tsirkin

[-- Attachment #1: Type: text/plain, Size: 5581 bytes --]

On Mon, Jul 22, 2019 at 03:41:08PM +0200, David Hildenbrand wrote:
> We still have multiple issues in the current code
> - The PBP is not freed during unrealize()
> - The PBP is not reset on device resets: After a reset, the PBP is stale.
> - We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
>   guests (esp. legacy guests) will reuse pages without deflating,
>   turning the PBP stale. Adding that would require compat handling.
> 
> Instead, let's use the PBP only temporarily, when processing one bulk of
> inflation requests. This will keep guest_page_size > 4k working (with
> Linux guests). There is nothing to do for deflation requests anymore.
> The pbp is only used for a limited amount of time.
> 
> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>                      host page size")
> Cc: qemu-stable@nongnu.org #v4.0.0
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Ah, nice idea.  

Acked-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/virtio/virtio-balloon.c         | 21 +++++++++------------
>  include/hw/virtio/virtio-balloon.h |  3 ---
>  2 files changed, 9 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 40d493a31a..a6282d58d4 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -34,11 +34,11 @@
>  
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
> -struct PartiallyBalloonedPage {
> +typedef struct PartiallyBalloonedPage {
>      ram_addr_t base_gpa;
>      long subpages;
>      unsigned long *bitmap;
> -};
> +} PartiallyBalloonedPage;
>  
>  static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
>  {
> @@ -68,11 +68,11 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
>  }
>  
>  static void balloon_inflate_page(VirtIOBalloon *balloon,
> -                                 MemoryRegion *mr, hwaddr mr_offset)
> +                                 MemoryRegion *mr, hwaddr mr_offset,
> +                                 PartiallyBalloonedPage **pbp)
>  {
>      void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
>      ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
> -    PartiallyBalloonedPage **pbp = &balloon->pbp;
>      RAMBlock *rb;
>      size_t rb_page_size;
>      int subpages;
> @@ -149,12 +149,6 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>      rb = qemu_ram_block_from_host(addr, false, &rb_offset);
>      rb_page_size = qemu_ram_pagesize(rb);
>  
> -    if (balloon->pbp) {
> -        /* Let's play safe and always reset the pbp on deflation requests. */
> -        virtio_balloon_pbp_free(balloon->pbp);
> -        balloon->pbp = NULL;
> -    }
> -
>      host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
>  
>      /* When a page is deflated, we hint the whole host page it lives
> @@ -336,6 +330,7 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
>  static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>  {
>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
> +    PartiallyBalloonedPage *pbp = NULL;
>      VirtQueueElement *elem;
>      MemoryRegionSection section;
>  
> @@ -344,7 +339,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>          uint32_t pfn;
>          elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
>          if (!elem) {
> -            return;
> +            break;
>          }
>  
>          while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
> @@ -373,7 +368,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>              if (!qemu_balloon_is_inhibited()) {
>                  if (vq == s->ivq) {
>                      balloon_inflate_page(s, section.mr,
> -                                         section.offset_within_region);
> +                                         section.offset_within_region, &pbp);
>                  } else if (vq == s->dvq) {
>                      balloon_deflate_page(s, section.mr, section.offset_within_region);
>                  } else {
> @@ -387,6 +382,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>          virtio_notify(vdev, vq);
>          g_free(elem);
>      }
> +
> +    virtio_balloon_pbp_free(pbp);
>  }
>  
>  static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 5a99293a45..d1c968d237 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -33,8 +33,6 @@ typedef struct virtio_balloon_stat_modern {
>         uint64_t val;
>  } VirtIOBalloonStatModern;
>  
> -typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
> -
>  enum virtio_balloon_free_page_report_status {
>      FREE_PAGE_REPORT_S_STOP = 0,
>      FREE_PAGE_REPORT_S_REQUESTED = 1,
> @@ -70,7 +68,6 @@ typedef struct VirtIOBalloon {
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
>      uint32_t host_features;
> -    PartiallyBalloonedPage *pbp;
>  
>      bool qemu_4_0_config_size;
>  } VirtIOBalloon;

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v3 5/6] virtio-balloon: Rework pbp tracking data
  2019-07-23  2:54   ` David Gibson
@ 2019-07-23  7:38     ` David Hildenbrand
  0 siblings, 0 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-23  7:38 UTC (permalink / raw)
  To: David Gibson
  Cc: Igor Mammedov, qemu-stable, qemu-devel, Stefan Hajnoczi,
	Michael S . Tsirkin

On 23.07.19 04:54, David Gibson wrote:
> On Mon, Jul 22, 2019 at 03:41:07PM +0200, David Hildenbrand wrote:
>> Using the address of a RAMBlock to test for a matching pbp is not really
>> safe. Instead, let's use the guest physical address of the base page
>> along with the page size (via the number of subpages).
>>
>> Also, let's allocate the bitmap separately. This makes the code
>> easier to read and maintain - we can reuse bitmap_new().
>>
>> Prepare the code to move the PBP out of the device.
>>
>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>>                      host page size")
>> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
>>                      with inflates & deflates")
>> Cc: qemu-stable@nongnu.org #v4.0.0
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  hw/virtio/virtio-balloon.c | 69 +++++++++++++++++++++++++-------------
>>  1 file changed, 46 insertions(+), 23 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index f206cc8bf7..40d493a31a 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
>> @@ -35,16 +35,44 @@
>>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>>  
>>  struct PartiallyBalloonedPage {
>> -    RAMBlock *rb;
>> -    ram_addr_t base;
>> -    unsigned long bitmap[];
>> +    ram_addr_t base_gpa;
>> +    long subpages;
>> +    unsigned long *bitmap;
>>  };
>>  
>> +static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
>> +{
>> +    if (!pbp) {
>> +        return;
>> +    }
>> +    g_free(pbp->bitmap);
>> +    g_free(pbp);
>> +}
>> +
>> +static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
>> +                                                        long subpages)
>> +{
>> +    PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
>> +
>> +    pbp->base_gpa = base_gpa;
>> +    pbp->subpages = subpages;
>> +    pbp->bitmap = bitmap_new(subpages);
>> +
>> +    return pbp;
>> +}
>> +
>> +static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
>> +                                       ram_addr_t base_gpa, long subpages)
>> +{
>> +    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;
> 
> I think the test on subpages is pointless here.  You've (reasonably)
> rejected handling the edge case of a ramblock being removed and
> replugged - but the only thing this handles is an edge case of that
> edge case.

I think we have to leave it in place until patch #6. We could remove it
after Patch #6 - replug cannot happen while processing the inflation
requests.

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes
@ 2019-07-25 11:36 David Hildenbrand
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 1/7] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
                   ` (6 more replies)
  0 siblings, 7 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

As we discovered yet another issue with current PBP code, we decided
to use a temporary PBP only, limited to one series of inflation requests.

Patch #1 is a fix for a wrong sign extension (MST brought this up but
wasn't sure if it is broken - I think it is indeed broken). Patch #2 fixed
QEMU segfaults. Patch #3 and #4 are cleanups that make follow-up fixes
easier. Patch #5 avoids using RAMBlock addresses as tokens and patch #6
fixes all kinds of issues related to using a global PBP. Patch #7 gets
rid of a temporary workaround from patch #5.

We want to have patches 1-6 in 4.1 and backport them to 4.0. Patch #1
needs backports to basically all QEMU releases with virtio-balloon.

Did a quick sanity test, hopefully no other BUG sneeked in. Will do some
more testing.

v3 -> v4:
- Add "virtio-balloon: No need to track subpages for the PBP anymore",
  which doesn't need a stable backport

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: qemu-stable@nongnu.org

David Hildenbrand (7):
  virtio-balloon: Fix wrong sign extension of PFNs
  virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
  virtio-balloon: Simplify deflate with pbp
  virtio-balloon: Better names for offset variables in inflate/deflate
    code
  virtio-balloon: Rework pbp tracking data
  virtio-balloon: Use temporary PBP only
  virtio-balloon: No need to track subpages for the PBP anymore

 hw/virtio/virtio-balloon.c         | 120 ++++++++++++++---------------
 include/hw/virtio/virtio-balloon.h |   3 -
 2 files changed, 60 insertions(+), 63 deletions(-)

-- 
2.21.0



^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v4 1/7] virtio-balloon: Fix wrong sign extension of PFNs
  2019-07-25 11:36 [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes David Hildenbrand
@ 2019-07-25 11:36 ` David Hildenbrand
  2019-07-25 12:36   ` Pankaj Gupta
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 2/7] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
                   ` (5 subsequent siblings)
  6 siblings, 1 reply; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

If we directly cast from int to uint64_t, we will first sign-extend to
an int64_t, which is wrong. We actually want to treat the PFNs like
unsigned values.

As far as I can see, this dates back to the initial virtio-balloon
commit, but wasn't triggered as fairly big guests would be required.

Cc: qemu-stable@nongnu.org
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index e85d1c0d5c..515abf6553 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -343,8 +343,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         }
 
         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
+            unsigned int p = virtio_ldl_p(vdev, &pfn);
             hwaddr pa;
-            int p = virtio_ldl_p(vdev, &pfn);
 
             pa = (hwaddr) p << VIRTIO_BALLOON_PFN_SHIFT;
             offset += 4;
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v4 2/7] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
  2019-07-25 11:36 [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes David Hildenbrand
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 1/7] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
@ 2019-07-25 11:36 ` David Hildenbrand
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 3/7] virtio-balloon: Simplify deflate with pbp David Hildenbrand
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

We are using the wrong functions to set/clear bits, effectively touching
multiple bits, writing out of range of the bitmap, resulting in memory
corruptions. We have to use set_bit()/clear_bit() instead.

Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
inflating the balloon. QEMU crashes. This never could have worked
properly - especially, also pages would have been discarded when the
first sub-page would be inflated (the whole bitmap would be set).

While testing I realized, that on hugetlbfs it is pretty much impossible
to discard a page - the guest just frees the 4k sub-pages in random order
most of the time. I was only able to discard a hugepage a handful of
times - so I hope that now works correctly.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
                     host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
                     with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 515abf6553..a78d2d2184 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
         balloon->pbp->base = host_page_base;
     }
 
-    bitmap_set(balloon->pbp->bitmap,
-               (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-               subpages);
+    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+            balloon->pbp->bitmap);
 
     if (bitmap_full(balloon->pbp->bitmap, subpages)) {
         /* We've accumulated a full host page, we can actually discard
@@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
          * for a guest to do this in practice, but handle it anyway,
          * since getting it wrong could mean discarding memory the
          * guest is still using. */
-        bitmap_clear(balloon->pbp->bitmap,
-                     (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-                     subpages);
+        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+                  balloon->pbp->bitmap);
 
         if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
             g_free(balloon->pbp);
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v4 3/7] virtio-balloon: Simplify deflate with pbp
  2019-07-25 11:36 [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes David Hildenbrand
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 1/7] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 2/7] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
@ 2019-07-25 11:36 ` David Hildenbrand
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 4/7] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

Let's simplify this - the case we are optimizing for is very hard to
trigger and not worth the effort. If we're switching from inflation to
deflation, let's reset the pbp.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 26 +++++---------------------
 1 file changed, 5 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a78d2d2184..04a7e6c772 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -117,7 +117,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
     void *addr = memory_region_get_ram_ptr(mr) + offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset, host_page_base;
+    ram_addr_t ram_offset;
     void *host_addr;
     int ret;
 
@@ -125,27 +125,11 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
      * host address? */
     rb = qemu_ram_block_from_host(addr, false, &ram_offset);
     rb_page_size = qemu_ram_pagesize(rb);
-    host_page_base = ram_offset & ~(rb_page_size - 1);
-
-    if (balloon->pbp
-        && rb == balloon->pbp->rb
-        && host_page_base == balloon->pbp->base) {
-        int subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
-        /*
-         * This means the guest has asked to discard some of the 4kiB
-         * subpages of a host page, but then changed its mind and
-         * asked to keep them after all.  It's exceedingly unlikely
-         * for a guest to do this in practice, but handle it anyway,
-         * since getting it wrong could mean discarding memory the
-         * guest is still using. */
-        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-                  balloon->pbp->bitmap);
-
-        if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
-            g_free(balloon->pbp);
-            balloon->pbp = NULL;
-        }
+    if (balloon->pbp) {
+        /* Let's play safe and always reset the pbp on deflation requests. */
+        g_free(balloon->pbp);
+        balloon->pbp = NULL;
     }
 
     host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v4 4/7] virtio-balloon: Better names for offset variables in inflate/deflate code
  2019-07-25 11:36 [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes David Hildenbrand
                   ` (2 preceding siblings ...)
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 3/7] virtio-balloon: Simplify deflate with pbp David Hildenbrand
@ 2019-07-25 11:36 ` David Hildenbrand
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 5/7] virtio-balloon: Rework pbp tracking data David Hildenbrand
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

"host_page_base" is really confusing, let's make this clearer, also
rename the other offsets to indicate to which base they apply.

offset -> mr_offset
ram_offset -> rb_offset
host_page_base -> rb_aligned_offset

While at it, use QEMU_ALIGN_DOWN() instead of a handcrafted computation
and move the computation to the place where it is needed.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 04a7e6c772..f206cc8bf7 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -41,24 +41,23 @@ struct PartiallyBalloonedPage {
 };
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr offset)
+                                 MemoryRegion *mr, hwaddr mr_offset)
 {
-    void *addr = memory_region_get_ram_ptr(mr) + offset;
+    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+    ram_addr_t rb_offset, rb_aligned_offset;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
-    ram_addr_t ram_offset, host_page_base;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
-    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
-    host_page_base = ram_offset & ~(rb_page_size - 1);
 
     if (rb_page_size == BALLOON_PAGE_SIZE) {
         /* Easy case */
 
-        ram_block_discard_range(rb, ram_offset, rb_page_size);
+        ram_block_discard_range(rb, rb_offset, rb_page_size);
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
@@ -74,11 +73,12 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
     warn_report_once(
 "Balloon used with backing page size > 4kiB, this may not be reliable");
 
+    rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
     subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
     if (balloon->pbp
         && (rb != balloon->pbp->rb
-            || host_page_base != balloon->pbp->base)) {
+            || rb_aligned_offset != balloon->pbp->base)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
@@ -91,10 +91,10 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
         size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
         balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
         balloon->pbp->rb = rb;
-        balloon->pbp->base = host_page_base;
+        balloon->pbp->base = rb_aligned_offset;
     }
 
-    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
             balloon->pbp->bitmap);
 
     if (bitmap_full(balloon->pbp->bitmap, subpages)) {
@@ -112,18 +112,18 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 }
 
 static void balloon_deflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr offset)
+                                 MemoryRegion *mr, hwaddr mr_offset)
 {
-    void *addr = memory_region_get_ram_ptr(mr) + offset;
+    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+    ram_addr_t rb_offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset;
     void *host_addr;
     int ret;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
-    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
 
     if (balloon->pbp) {
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v4 5/7] virtio-balloon: Rework pbp tracking data
  2019-07-25 11:36 [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes David Hildenbrand
                   ` (3 preceding siblings ...)
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 4/7] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
@ 2019-07-25 11:36 ` David Hildenbrand
  2019-07-26  8:08   ` David Gibson
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 6/7] virtio-balloon: Use temporary PBP only David Hildenbrand
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore David Hildenbrand
  6 siblings, 1 reply; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

Using the address of a RAMBlock to test for a matching pbp is not really
safe. Instead, let's use the guest physical address of the base page
along with the page size (via the number of subpages).

Also, let's allocate the bitmap separately. This makes the code
easier to read and maintain - we can reuse bitmap_new().

Prepare the code to move the PBP out of the device.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
                     host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
                     with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 69 +++++++++++++++++++++++++-------------
 1 file changed, 46 insertions(+), 23 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index f206cc8bf7..40d493a31a 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -35,16 +35,44 @@
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
 struct PartiallyBalloonedPage {
-    RAMBlock *rb;
-    ram_addr_t base;
-    unsigned long bitmap[];
+    ram_addr_t base_gpa;
+    long subpages;
+    unsigned long *bitmap;
 };
 
+static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
+{
+    if (!pbp) {
+        return;
+    }
+    g_free(pbp->bitmap);
+    g_free(pbp);
+}
+
+static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
+                                                        long subpages)
+{
+    PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
+
+    pbp->base_gpa = base_gpa;
+    pbp->subpages = subpages;
+    pbp->bitmap = bitmap_new(subpages);
+
+    return pbp;
+}
+
+static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
+                                       ram_addr_t base_gpa, long subpages)
+{
+    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;
+}
+
 static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr mr_offset)
 {
     void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
-    ram_addr_t rb_offset, rb_aligned_offset;
+    ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
+    PartiallyBalloonedPage **pbp = &balloon->pbp;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
@@ -75,39 +103,34 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 
     rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
     subpages = rb_page_size / BALLOON_PAGE_SIZE;
+    base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
+               (rb_offset - rb_aligned_offset);
 
-    if (balloon->pbp
-        && (rb != balloon->pbp->rb
-            || rb_aligned_offset != balloon->pbp->base)) {
+    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa, subpages)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
-        g_free(balloon->pbp);
-        balloon->pbp = NULL;
+        virtio_balloon_pbp_free(*pbp);
+        *pbp = NULL;
     }
 
-    if (!balloon->pbp) {
-        /* Starting on a new host page */
-        size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
-        balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
-        balloon->pbp->rb = rb;
-        balloon->pbp->base = rb_aligned_offset;
+    if (!*pbp) {
+        *pbp = virtio_balloon_pbp_alloc(base_gpa, subpages);
     }
 
-    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-            balloon->pbp->bitmap);
+    set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
+            (*pbp)->bitmap);
 
-    if (bitmap_full(balloon->pbp->bitmap, subpages)) {
+    if (bitmap_full((*pbp)->bitmap, subpages)) {
         /* We've accumulated a full host page, we can actually discard
          * it now */
 
-        ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
+        ram_block_discard_range(rb, rb_aligned_offset, rb_page_size);
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
-
-        g_free(balloon->pbp);
-        balloon->pbp = NULL;
+        virtio_balloon_pbp_free(*pbp);
+        *pbp = NULL;
     }
 }
 
@@ -128,7 +151,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
 
     if (balloon->pbp) {
         /* Let's play safe and always reset the pbp on deflation requests. */
-        g_free(balloon->pbp);
+        virtio_balloon_pbp_free(balloon->pbp);
         balloon->pbp = NULL;
     }
 
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v4 6/7] virtio-balloon: Use temporary PBP only
  2019-07-25 11:36 [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes David Hildenbrand
                   ` (4 preceding siblings ...)
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 5/7] virtio-balloon: Rework pbp tracking data David Hildenbrand
@ 2019-07-25 11:36 ` David Hildenbrand
  2019-07-25 11:53   ` Michael S. Tsirkin
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore David Hildenbrand
  6 siblings, 1 reply; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

We still have multiple issues in the current code
- The PBP is not freed during unrealize()
- The PBP is not reset on device resets: After a reset, the PBP is stale.
- We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
  guests (esp. legacy guests) will reuse pages without deflating,
  turning the PBP stale. Adding that would require compat handling.

Instead, let's use the PBP only temporarily, when processing one bulk of
inflation requests. This will keep guest_page_size > 4k working (with
Linux guests). There is nothing to do for deflation requests anymore.
The pbp is only used for a limited amount of time.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
                     host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c         | 21 +++++++++------------
 include/hw/virtio/virtio-balloon.h |  3 ---
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 40d493a31a..a6282d58d4 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -34,11 +34,11 @@
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
-struct PartiallyBalloonedPage {
+typedef struct PartiallyBalloonedPage {
     ram_addr_t base_gpa;
     long subpages;
     unsigned long *bitmap;
-};
+} PartiallyBalloonedPage;
 
 static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
 {
@@ -68,11 +68,11 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr mr_offset)
+                                 MemoryRegion *mr, hwaddr mr_offset,
+                                 PartiallyBalloonedPage **pbp)
 {
     void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
     ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
-    PartiallyBalloonedPage **pbp = &balloon->pbp;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
@@ -149,12 +149,6 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
     rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
 
-    if (balloon->pbp) {
-        /* Let's play safe and always reset the pbp on deflation requests. */
-        virtio_balloon_pbp_free(balloon->pbp);
-        balloon->pbp = NULL;
-    }
-
     host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
 
     /* When a page is deflated, we hint the whole host page it lives
@@ -336,6 +330,7 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
 static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
+    PartiallyBalloonedPage *pbp = NULL;
     VirtQueueElement *elem;
     MemoryRegionSection section;
 
@@ -344,7 +339,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         uint32_t pfn;
         elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
         if (!elem) {
-            return;
+            break;
         }
 
         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
@@ -373,7 +368,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
             if (!qemu_balloon_is_inhibited()) {
                 if (vq == s->ivq) {
                     balloon_inflate_page(s, section.mr,
-                                         section.offset_within_region);
+                                         section.offset_within_region, &pbp);
                 } else if (vq == s->dvq) {
                     balloon_deflate_page(s, section.mr, section.offset_within_region);
                 } else {
@@ -387,6 +382,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         virtio_notify(vdev, vq);
         g_free(elem);
     }
+
+    virtio_balloon_pbp_free(pbp);
 }
 
 static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 5a99293a45..d1c968d237 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -33,8 +33,6 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
-typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
-
 enum virtio_balloon_free_page_report_status {
     FREE_PAGE_REPORT_S_STOP = 0,
     FREE_PAGE_REPORT_S_REQUESTED = 1,
@@ -70,7 +68,6 @@ typedef struct VirtIOBalloon {
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;
-    PartiallyBalloonedPage *pbp;
 
     bool qemu_4_0_config_size;
 } VirtIOBalloon;
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore
  2019-07-25 11:36 [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes David Hildenbrand
                   ` (5 preceding siblings ...)
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 6/7] virtio-balloon: Use temporary PBP only David Hildenbrand
@ 2019-07-25 11:36 ` David Hildenbrand
  2019-07-25 15:32   ` [Qemu-devel] [PULL 11/12] virtio-balloon: don't track subpages for the PBP Michael S. Tsirkin
  2019-07-26  8:10   ` [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore David Gibson
  6 siblings, 2 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:36 UTC (permalink / raw)
  To: qemu-devel
  Cc: Michael S . Tsirkin, David Hildenbrand, qemu-stable,
	Stefan Hajnoczi, Igor Mammedov, David Gibson

As ramblocks cannot get removed/readded while we are processing a bulk
of inflation requests, there is no more need to track the page size
in form of the number of subpages.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 hw/virtio/virtio-balloon.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a6282d58d4..fe9664e42c 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -36,7 +36,6 @@
 
 typedef struct PartiallyBalloonedPage {
     ram_addr_t base_gpa;
-    long subpages;
     unsigned long *bitmap;
 } PartiallyBalloonedPage;
 
@@ -55,16 +54,15 @@ static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
     PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
 
     pbp->base_gpa = base_gpa;
-    pbp->subpages = subpages;
     pbp->bitmap = bitmap_new(subpages);
 
     return pbp;
 }
 
 static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
-                                       ram_addr_t base_gpa, long subpages)
+                                       ram_addr_t base_gpa)
 {
-    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;
+    return pbp->base_gpa == base_gpa;
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
@@ -106,7 +104,7 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
     base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
                (rb_offset - rb_aligned_offset);
 
-    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa, subpages)) {
+    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
-- 
2.21.0



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v4 6/7] virtio-balloon: Use temporary PBP only
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 6/7] virtio-balloon: Use temporary PBP only David Hildenbrand
@ 2019-07-25 11:53   ` Michael S. Tsirkin
  2019-07-25 11:56     ` David Hildenbrand
  0 siblings, 1 reply; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 11:53 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-stable, qemu-devel, Stefan Hajnoczi, David Gibson

On Thu, Jul 25, 2019 at 01:36:37PM +0200, David Hildenbrand wrote:
> We still have multiple issues in the current code
> - The PBP is not freed during unrealize()
> - The PBP is not reset on device resets: After a reset, the PBP is stale.
> - We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
>   guests (esp. legacy guests) will reuse pages without deflating,
>   turning the PBP stale. Adding that would require compat handling.
> 
> Instead, let's use the PBP only temporarily, when processing one bulk of
> inflation requests. This will keep guest_page_size > 4k working (with
> Linux guests). There is nothing to do for deflation requests anymore.
> The pbp is only used for a limited amount of time.
> 
> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>                      host page size")
> Cc: qemu-stable@nongnu.org #v4.0.0
> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
> Acked-by: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/virtio/virtio-balloon.c         | 21 +++++++++------------
>  include/hw/virtio/virtio-balloon.h |  3 ---
>  2 files changed, 9 insertions(+), 15 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index 40d493a31a..a6282d58d4 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -34,11 +34,11 @@
>  
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
> -struct PartiallyBalloonedPage {
> +typedef struct PartiallyBalloonedPage {
>      ram_addr_t base_gpa;
>      long subpages;
>      unsigned long *bitmap;
> -};
> +} PartiallyBalloonedPage;
>  
>  static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
>  {
> @@ -68,11 +68,11 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
>  }
>  
>  static void balloon_inflate_page(VirtIOBalloon *balloon,
> -                                 MemoryRegion *mr, hwaddr mr_offset)
> +                                 MemoryRegion *mr, hwaddr mr_offset,
> +                                 PartiallyBalloonedPage **pbp)
>  {
>      void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
>      ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
> -    PartiallyBalloonedPage **pbp = &balloon->pbp;
>      RAMBlock *rb;
>      size_t rb_page_size;
>      int subpages;
> @@ -149,12 +149,6 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>      rb = qemu_ram_block_from_host(addr, false, &rb_offset);
>      rb_page_size = qemu_ram_pagesize(rb);
>  
> -    if (balloon->pbp) {
> -        /* Let's play safe and always reset the pbp on deflation requests. */
> -        virtio_balloon_pbp_free(balloon->pbp);
> -        balloon->pbp = NULL;
> -    }
> -
>      host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
>  
>      /* When a page is deflated, we hint the whole host page it lives
> @@ -336,6 +330,7 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
>  static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>  {
>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
> +    PartiallyBalloonedPage *pbp = NULL;
>      VirtQueueElement *elem;
>      MemoryRegionSection section;
>  
> @@ -344,7 +339,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>          uint32_t pfn;
>          elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
>          if (!elem) {
> -            return;
> +            break;
>          }
>  
>          while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
> @@ -373,7 +368,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>              if (!qemu_balloon_is_inhibited()) {
>                  if (vq == s->ivq) {
>                      balloon_inflate_page(s, section.mr,
> -                                         section.offset_within_region);
> +                                         section.offset_within_region, &pbp);
>                  } else if (vq == s->dvq) {
>                      balloon_deflate_page(s, section.mr, section.offset_within_region);
>                  } else {
> @@ -387,6 +382,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>          virtio_notify(vdev, vq);
>          g_free(elem);
>      }
> +
> +    virtio_balloon_pbp_free(pbp);
>  }
>  
>  static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)

Oops this is not early enough.
We must not track pbp across elements: once we push an element
guest can reuse the page.

I propose changing virtio_balloon_pbp_free to get
PartiallyBalloonedPage** and set it to NULL.

Call that within the loop after g_free(elem);




> diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
> index 5a99293a45..d1c968d237 100644
> --- a/include/hw/virtio/virtio-balloon.h
> +++ b/include/hw/virtio/virtio-balloon.h
> @@ -33,8 +33,6 @@ typedef struct virtio_balloon_stat_modern {
>         uint64_t val;
>  } VirtIOBalloonStatModern;
>  
> -typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
> -
>  enum virtio_balloon_free_page_report_status {
>      FREE_PAGE_REPORT_S_STOP = 0,
>      FREE_PAGE_REPORT_S_REQUESTED = 1,
> @@ -70,7 +68,6 @@ typedef struct VirtIOBalloon {
>      int64_t stats_last_update;
>      int64_t stats_poll_interval;
>      uint32_t host_features;
> -    PartiallyBalloonedPage *pbp;
>  
>      bool qemu_4_0_config_size;
>  } VirtIOBalloon;
> -- 
> 2.21.0


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v4 6/7] virtio-balloon: Use temporary PBP only
  2019-07-25 11:53   ` Michael S. Tsirkin
@ 2019-07-25 11:56     ` David Hildenbrand
  0 siblings, 0 replies; 53+ messages in thread
From: David Hildenbrand @ 2019-07-25 11:56 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: Igor Mammedov, qemu-stable, qemu-devel, Stefan Hajnoczi, David Gibson

On 25.07.19 13:53, Michael S. Tsirkin wrote:
> On Thu, Jul 25, 2019 at 01:36:37PM +0200, David Hildenbrand wrote:
>> We still have multiple issues in the current code
>> - The PBP is not freed during unrealize()
>> - The PBP is not reset on device resets: After a reset, the PBP is stale.
>> - We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
>>   guests (esp. legacy guests) will reuse pages without deflating,
>>   turning the PBP stale. Adding that would require compat handling.
>>
>> Instead, let's use the PBP only temporarily, when processing one bulk of
>> inflation requests. This will keep guest_page_size > 4k working (with
>> Linux guests). There is nothing to do for deflation requests anymore.
>> The pbp is only used for a limited amount of time.
>>
>> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>>                      host page size")
>> Cc: qemu-stable@nongnu.org #v4.0.0
>> Suggested-by: Michael S. Tsirkin <mst@redhat.com>
>> Acked-by: David Gibson <david@gibson.dropbear.id.au>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>  hw/virtio/virtio-balloon.c         | 21 +++++++++------------
>>  include/hw/virtio/virtio-balloon.h |  3 ---
>>  2 files changed, 9 insertions(+), 15 deletions(-)
>>
>> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
>> index 40d493a31a..a6282d58d4 100644
>> --- a/hw/virtio/virtio-balloon.c
>> +++ b/hw/virtio/virtio-balloon.c
>> @@ -34,11 +34,11 @@
>>  
>>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>>  
>> -struct PartiallyBalloonedPage {
>> +typedef struct PartiallyBalloonedPage {
>>      ram_addr_t base_gpa;
>>      long subpages;
>>      unsigned long *bitmap;
>> -};
>> +} PartiallyBalloonedPage;
>>  
>>  static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
>>  {
>> @@ -68,11 +68,11 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
>>  }
>>  
>>  static void balloon_inflate_page(VirtIOBalloon *balloon,
>> -                                 MemoryRegion *mr, hwaddr mr_offset)
>> +                                 MemoryRegion *mr, hwaddr mr_offset,
>> +                                 PartiallyBalloonedPage **pbp)
>>  {
>>      void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
>>      ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
>> -    PartiallyBalloonedPage **pbp = &balloon->pbp;
>>      RAMBlock *rb;
>>      size_t rb_page_size;
>>      int subpages;
>> @@ -149,12 +149,6 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>>      rb = qemu_ram_block_from_host(addr, false, &rb_offset);
>>      rb_page_size = qemu_ram_pagesize(rb);
>>  
>> -    if (balloon->pbp) {
>> -        /* Let's play safe and always reset the pbp on deflation requests. */
>> -        virtio_balloon_pbp_free(balloon->pbp);
>> -        balloon->pbp = NULL;
>> -    }
>> -
>>      host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
>>  
>>      /* When a page is deflated, we hint the whole host page it lives
>> @@ -336,6 +330,7 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
>>  static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>  {
>>      VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
>> +    PartiallyBalloonedPage *pbp = NULL;
>>      VirtQueueElement *elem;
>>      MemoryRegionSection section;
>>  
>> @@ -344,7 +339,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>          uint32_t pfn;
>>          elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
>>          if (!elem) {
>> -            return;
>> +            break;
>>          }
>>  
>>          while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
>> @@ -373,7 +368,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>              if (!qemu_balloon_is_inhibited()) {
>>                  if (vq == s->ivq) {
>>                      balloon_inflate_page(s, section.mr,
>> -                                         section.offset_within_region);
>> +                                         section.offset_within_region, &pbp);
>>                  } else if (vq == s->dvq) {
>>                      balloon_deflate_page(s, section.mr, section.offset_within_region);
>>                  } else {
>> @@ -387,6 +382,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
>>          virtio_notify(vdev, vq);
>>          g_free(elem);
>>      }
>> +
>> +    virtio_balloon_pbp_free(pbp);
>>  }
>>  
>>  static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
> 
> Oops this is not early enough.
> We must not track pbp across elements: once we push an element
> guest can reuse the page.
> 

Right, it has to go into the loop.

-- 

Thanks,

David / dhildenb


^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v4 1/7] virtio-balloon: Fix wrong sign extension of PFNs
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 1/7] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
@ 2019-07-25 12:36   ` Pankaj Gupta
  0 siblings, 0 replies; 53+ messages in thread
From: Pankaj Gupta @ 2019-07-25 12:36 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Michael S . Tsirkin, qemu-stable, qemu-devel, Stefan Hajnoczi,
	Igor Mammedov, David Gibson


> 
> If we directly cast from int to uint64_t, we will first sign-extend to
> an int64_t, which is wrong. We actually want to treat the PFNs like
> unsigned values.
> 
> As far as I can see, this dates back to the initial virtio-balloon
> commit, but wasn't triggered as fairly big guests would be required.
> 
> Cc: qemu-stable@nongnu.org
> Reported-by: Michael S. Tsirkin <mst@redhat.com>
> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  hw/virtio/virtio-balloon.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index e85d1c0d5c..515abf6553 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -343,8 +343,8 @@ static void virtio_balloon_handle_output(VirtIODevice
> *vdev, VirtQueue *vq)
>          }
>  
>          while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) ==
>          4) {                                                         
> +            unsigned int p = virtio_ldl_p(vdev, &pfn);
>              hwaddr pa;
> -            int p = virtio_ldl_p(vdev, &pfn);
>  
>              pa = (hwaddr) p << VIRTIO_BALLOON_PFN_SHIFT;
>              offset += 4;
> --
> 2.21.0

Reviewed-by: Pankaj Gupta <pagupta@redhat.com>

> 
> 
> 


^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 00/12] virtio, pc: fixes, cleanups
@ 2019-07-25 15:31 Michael S. Tsirkin
  2019-06-02 11:42 ` [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins Jan Kiszka
                   ` (5 more replies)
  0 siblings, 6 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell

The following changes since commit bf8b024372bf8abf5a9f40bfa65eeefad23ff988:

  Update version for v4.1.0-rc2 release (2019-07-23 18:28:08 +0100)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream

for you to fetch changes up to 1b47b37c33ec01ae1efc527f4c97f97f93723bc4:

  virtio-balloon: free pbp more aggressively (2019-07-25 11:19:25 -0400)

----------------------------------------------------------------
virtio, pc: fixes, cleanups

A bunch of fixes all over the place.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

----------------------------------------------------------------
David Hildenbrand (7):
      virtio-balloon: Fix wrong sign extension of PFNs
      virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
      virtio-balloon: Simplify deflate with pbp
      virtio-balloon: Better names for offset variables in inflate/deflate code
      virtio-balloon: Rework pbp tracking data
      virtio-balloon: Use temporary PBP only
      virtio-balloon: don't track subpages for the PBP

Evgeny Yakovlev (2):
      i386/acpi: fix gint overflow in crs_range_compare
      i386/acpi: show PCI Express bus on pxb-pcie expanders

Jan Kiszka (1):
      ioapic: kvm: Skip route updates for masked pins

Michael S. Tsirkin (1):
      virtio-balloon: free pbp more aggressively

Stefan Hajnoczi (1):
      docs: clarify multiqueue vs multiple virtqueues

 include/hw/virtio/virtio-balloon.h |   3 -
 hw/i386/acpi-build.c               |  17 ++++--
 hw/intc/ioapic.c                   |   8 ++-
 hw/virtio/virtio-balloon.c         | 115 ++++++++++++++++++-------------------
 docs/interop/vhost-user.rst        |  17 ++++++
 5 files changed, 90 insertions(+), 70 deletions(-)



^ permalink raw reply	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 01/12] docs: clarify multiqueue vs multiple virtqueues
  2019-06-24  9:13 ` [Qemu-devel] [PATCH] docs: clarify multiqueue vs multiple virtqueues Stefan Hajnoczi
  2019-06-24 10:19   ` Marc-André Lureau
  2019-07-17 10:14   ` Stefan Hajnoczi
@ 2019-07-25 15:31   ` Michael S. Tsirkin
  2 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, Stefan Hajnoczi, Marc-André Lureau

From: Stefan Hajnoczi <stefanha@redhat.com>

The vhost-user specification does not explain when
VHOST_USER_PROTOCOL_F_MQ must be implemented.  This may lead
implementors of vhost-user masters to believe that this protocol feature
is required for any device that has multiple virtqueues.  That would be
a mistake since existing vhost-user slaves offer multiple virtqueues but
do not advertise VHOST_USER_PROTOCOL_F_MQ.

For example, a vhost-net device with one rx/tx queue pair is not
multiqueue.  The slave does not need to advertise
VHOST_USER_PROTOCOL_F_MQ.  Therefore the master must assume it has these
virtqueues and cannot rely on askingt the slave how many virtqueues
exist.

Extend the specification to explain the different between true
multiqueue and regular devices with a fixed virtqueue layout.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20190624091304.666-1-stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
---
 docs/interop/vhost-user.rst | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

diff --git a/docs/interop/vhost-user.rst b/docs/interop/vhost-user.rst
index 5750668aba..7827b710aa 100644
--- a/docs/interop/vhost-user.rst
+++ b/docs/interop/vhost-user.rst
@@ -324,6 +324,15 @@ must support changing some configuration aspects on the fly.
 Multiple queue support
 ----------------------
 
+Many devices have a fixed number of virtqueues.  In this case the master
+already knows the number of available virtqueues without communicating with the
+slave.
+
+Some devices do not have a fixed number of virtqueues.  Instead the maximum
+number of virtqueues is chosen by the slave.  The number can depend on host
+resource availability or slave implementation details.  Such devices are called
+multiple queue devices.
+
 Multiple queue support allows the slave to advertise the maximum number of
 queues.  This is treated as a protocol extension, hence the slave has to
 implement protocol features first. The multiple queues feature is supported
@@ -339,6 +348,14 @@ queue in the sent message to identify a specified queue.
 The master enables queues by sending message ``VHOST_USER_SET_VRING_ENABLE``.
 vhost-user-net has historically automatically enabled the first queue pair.
 
+Slaves should always implement the ``VHOST_USER_PROTOCOL_F_MQ`` protocol
+feature, even for devices with a fixed number of virtqueues, since it is simple
+to implement and offers a degree of introspection.
+
+Masters must not rely on the ``VHOST_USER_PROTOCOL_F_MQ`` protocol feature for
+devices with a fixed number of virtqueues.  Only true multiqueue devices
+require this protocol feature.
+
 Migration
 ---------
 
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 02/12] i386/acpi: fix gint overflow in crs_range_compare
  2019-07-18 16:14 ` [Qemu-devel] [PATCH v2] i386/acpi: fix gint overflow in crs_range_compare Evgeny Yakovlev
  2019-07-18 20:30   ` Michael S. Tsirkin
@ 2019-07-25 15:31   ` Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, qemu-stable, Igor Mammedov,
	Paolo Bonzini, Evgeny Yakovlev, Richard Henderson

From: Evgeny Yakovlev <wrfsh@yandex-team.ru>

When very large regions (32GB sized in our case, PCI pass-through of GPUs)
are compared substraction result does not fit into gint.

As a result crs_replace_with_free_ranges does not get sorted ranges and
incorrectly computes PCI64 free space regions. Which then makes linux
guest complain about device and PCI64 hole intersection and device
becomes unusable.

Fix that by returning exactly fitting ranges.

Also fix indentation of an entire crs_replace_with_free_ranges to make
checkpatch happy.

Cc: qemu-stable@nongnu.org
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1563466463-26012-1-git-send-email-wrfsh@yandex-team.ru>
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
---
 hw/i386/acpi-build.c | 12 +++++++++---
 1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index d281ffa89e..e7b756b51b 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -755,10 +755,16 @@ static void crs_range_set_free(CrsRangeSet *range_set)
 
 static gint crs_range_compare(gconstpointer a, gconstpointer b)
 {
-     CrsRangeEntry *entry_a = *(CrsRangeEntry **)a;
-     CrsRangeEntry *entry_b = *(CrsRangeEntry **)b;
+    CrsRangeEntry *entry_a = *(CrsRangeEntry **)a;
+    CrsRangeEntry *entry_b = *(CrsRangeEntry **)b;
 
-     return (int64_t)entry_a->base - (int64_t)entry_b->base;
+    if (entry_a->base < entry_b->base) {
+        return -1;
+    } else if (entry_a->base > entry_b->base) {
+        return 1;
+    } else {
+        return 0;
+    }
 }
 
 /*
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 03/12] ioapic: kvm: Skip route updates for masked pins
  2019-06-02 11:42 ` [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins Jan Kiszka
  2019-06-02 12:10   ` Peter Xu
  2019-06-03  0:36   ` Michael S. Tsirkin
@ 2019-07-25 15:31   ` Michael S. Tsirkin
  2 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Paolo Bonzini, qemu-stable, Peter Xu, Jan Kiszka

From: Jan Kiszka <jan.kiszka@siemens.com>

Masked entries will not generate interrupt messages, thus do no need to
be routed by KVM. This is a cosmetic cleanup, just avoiding warnings of
the kind

qemu-system-x86_64: vtd_irte_get: detected non-present IRTE (index=0, high=0xff00, low=0x100)

if the masked entry happens to reference a non-present IRTE.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-Id: <a84b7e03-f9a8-b577-be27-4d93d1caa1c9@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
---
 hw/intc/ioapic.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/hw/intc/ioapic.c b/hw/intc/ioapic.c
index c408749876..e99c37cceb 100644
--- a/hw/intc/ioapic.c
+++ b/hw/intc/ioapic.c
@@ -197,9 +197,11 @@ static void ioapic_update_kvm_routes(IOAPICCommonState *s)
             MSIMessage msg;
             struct ioapic_entry_info info;
             ioapic_entry_parse(s->ioredtbl[i], &info);
-            msg.address = info.addr;
-            msg.data = info.data;
-            kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
+            if (!info.masked) {
+                msg.address = info.addr;
+                msg.data = info.data;
+                kvm_irqchip_update_msi_route(kvm_state, i, msg, NULL);
+            }
         }
         kvm_irqchip_commit_routes(kvm_state);
     }
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 04/12] i386/acpi: show PCI Express bus on pxb-pcie expanders
  2019-07-19  8:54 ` [Qemu-devel] [PATCH] i386/acpi: show PCI Express bus on pxb-pcie expanders Evgeny Yakovlev
  2019-07-19 12:14   ` Igor Mammedov
@ 2019-07-25 15:31   ` Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:31 UTC (permalink / raw)
  To: qemu-devel
  Cc: Peter Maydell, Eduardo Habkost, qemu-stable, Igor Mammedov,
	Paolo Bonzini, Evgeny Yakovlev, Richard Henderson

From: Evgeny Yakovlev <wrfsh@yandex-team.ru>

Show PCIe host bridge PNP id with PCI host bridge as a compatible id
when expanding a pcie bus.

Cc: qemu-stable@nongnu.org
Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru>
Message-Id: <1563526469-15588-1-git-send-email-wrfsh@yandex-team.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/i386/acpi-build.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/hw/i386/acpi-build.c b/hw/i386/acpi-build.c
index e7b756b51b..f3fdfefcd5 100644
--- a/hw/i386/acpi-build.c
+++ b/hw/i386/acpi-build.c
@@ -1914,10 +1914,13 @@ build_dsdt(GArray *table_data, BIOSLinker *linker,
             scope = aml_scope("\\_SB");
             dev = aml_device("PC%.02X", bus_num);
             aml_append(dev, aml_name_decl("_UID", aml_int(bus_num)));
-            aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
             aml_append(dev, aml_name_decl("_BBN", aml_int(bus_num)));
             if (pci_bus_is_express(bus)) {
+                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A08")));
+                aml_append(dev, aml_name_decl("_CID", aml_eisaid("PNP0A03")));
                 aml_append(dev, build_q35_osc_method());
+            } else {
+                aml_append(dev, aml_name_decl("_HID", aml_eisaid("PNP0A03")));
             }
 
             if (numa_node != NUMA_NODE_UNASSIGNED) {
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 05/12] virtio-balloon: Fix wrong sign extension of PFNs
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 1/6] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
  2019-07-23  2:27   ` David Gibson
@ 2019-07-25 15:31   ` Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, David Gibson, qemu-stable, David Hildenbrand

From: David Hildenbrand <david@redhat.com>

If we directly cast from int to uint64_t, we will first sign-extend to
an int64_t, which is wrong. We actually want to treat the PFNs like
unsigned values.

As far as I can see, this dates back to the initial virtio-balloon
commit, but wasn't triggered as fairly big guests would be required.

Cc: qemu-stable@nongnu.org
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-2-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/virtio/virtio-balloon.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index e85d1c0d5c..515abf6553 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -343,8 +343,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         }
 
         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
+            unsigned int p = virtio_ldl_p(vdev, &pfn);
             hwaddr pa;
-            int p = virtio_ldl_p(vdev, &pfn);
 
             pa = (hwaddr) p << VIRTIO_BALLOON_PFN_SHIFT;
             offset += 4;
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 06/12] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 2/6] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
@ 2019-07-25 15:31   ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:31 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, David Gibson, qemu-stable, David Hildenbrand

From: David Hildenbrand <david@redhat.com>

We are using the wrong functions to set/clear bits, effectively touching
multiple bits, writing out of range of the bitmap, resulting in memory
corruptions. We have to use set_bit()/clear_bit() instead.

Can easily be reproduced by starting a qemu guest on hugetlbfs memory,
inflating the balloon. QEMU crashes. This never could have worked
properly - especially, also pages would have been discarded when the
first sub-page would be inflated (the whole bitmap would be set).

While testing I realized, that on hugetlbfs it is pretty much impossible
to discard a page - the guest just frees the 4k sub-pages in random order
most of the time. I was only able to discard a hugepage a handful of
times - so I hope that now works correctly.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-3-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c | 10 ++++------
 1 file changed, 4 insertions(+), 6 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 515abf6553..a78d2d2184 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -94,9 +94,8 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
         balloon->pbp->base = host_page_base;
     }
 
-    bitmap_set(balloon->pbp->bitmap,
-               (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-               subpages);
+    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+            balloon->pbp->bitmap);
 
     if (bitmap_full(balloon->pbp->bitmap, subpages)) {
         /* We've accumulated a full host page, we can actually discard
@@ -140,9 +139,8 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
          * for a guest to do this in practice, but handle it anyway,
          * since getting it wrong could mean discarding memory the
          * guest is still using. */
-        bitmap_clear(balloon->pbp->bitmap,
-                     (ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-                     subpages);
+        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+                  balloon->pbp->bitmap);
 
         if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
             g_free(balloon->pbp);
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 07/12] virtio-balloon: Simplify deflate with pbp
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 3/6] virtio-balloon: Simplify deflate with pbp David Hildenbrand
@ 2019-07-25 15:32   ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, David Gibson, David Hildenbrand

From: David Hildenbrand <david@redhat.com>

Let's simplify this - the case we are optimizing for is very hard to
trigger and not worth the effort. If we're switching from inflation to
deflation, let's reset the pbp.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-4-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c | 26 +++++---------------------
 1 file changed, 5 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a78d2d2184..04a7e6c772 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -117,7 +117,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
     void *addr = memory_region_get_ram_ptr(mr) + offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset, host_page_base;
+    ram_addr_t ram_offset;
     void *host_addr;
     int ret;
 
@@ -125,27 +125,11 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
      * host address? */
     rb = qemu_ram_block_from_host(addr, false, &ram_offset);
     rb_page_size = qemu_ram_pagesize(rb);
-    host_page_base = ram_offset & ~(rb_page_size - 1);
-
-    if (balloon->pbp
-        && rb == balloon->pbp->rb
-        && host_page_base == balloon->pbp->base) {
-        int subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
-        /*
-         * This means the guest has asked to discard some of the 4kiB
-         * subpages of a host page, but then changed its mind and
-         * asked to keep them after all.  It's exceedingly unlikely
-         * for a guest to do this in practice, but handle it anyway,
-         * since getting it wrong could mean discarding memory the
-         * guest is still using. */
-        clear_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-                  balloon->pbp->bitmap);
-
-        if (bitmap_empty(balloon->pbp->bitmap, subpages)) {
-            g_free(balloon->pbp);
-            balloon->pbp = NULL;
-        }
+    if (balloon->pbp) {
+        /* Let's play safe and always reset the pbp on deflation requests. */
+        g_free(balloon->pbp);
+        balloon->pbp = NULL;
     }
 
     host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 08/12] virtio-balloon: Better names for offset variables in inflate/deflate code
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 4/6] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
@ 2019-07-25 15:32   ` Michael S. Tsirkin
  0 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, David Gibson, David Hildenbrand

From: David Hildenbrand <david@redhat.com>

"host_page_base" is really confusing, let's make this clearer, also
rename the other offsets to indicate to which base they apply.

offset -> mr_offset
ram_offset -> rb_offset
host_page_base -> rb_aligned_offset

While at it, use QEMU_ALIGN_DOWN() instead of a handcrafted computation
and move the computation to the place where it is needed.

Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-5-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c | 26 +++++++++++++-------------
 1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 04a7e6c772..f206cc8bf7 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -41,24 +41,23 @@ struct PartiallyBalloonedPage {
 };
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr offset)
+                                 MemoryRegion *mr, hwaddr mr_offset)
 {
-    void *addr = memory_region_get_ram_ptr(mr) + offset;
+    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+    ram_addr_t rb_offset, rb_aligned_offset;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
-    ram_addr_t ram_offset, host_page_base;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
-    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
-    host_page_base = ram_offset & ~(rb_page_size - 1);
 
     if (rb_page_size == BALLOON_PAGE_SIZE) {
         /* Easy case */
 
-        ram_block_discard_range(rb, ram_offset, rb_page_size);
+        ram_block_discard_range(rb, rb_offset, rb_page_size);
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
@@ -74,11 +73,12 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
     warn_report_once(
 "Balloon used with backing page size > 4kiB, this may not be reliable");
 
+    rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
     subpages = rb_page_size / BALLOON_PAGE_SIZE;
 
     if (balloon->pbp
         && (rb != balloon->pbp->rb
-            || host_page_base != balloon->pbp->base)) {
+            || rb_aligned_offset != balloon->pbp->base)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
@@ -91,10 +91,10 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
         size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
         balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
         balloon->pbp->rb = rb;
-        balloon->pbp->base = host_page_base;
+        balloon->pbp->base = rb_aligned_offset;
     }
 
-    set_bit((ram_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
+    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
             balloon->pbp->bitmap);
 
     if (bitmap_full(balloon->pbp->bitmap, subpages)) {
@@ -112,18 +112,18 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 }
 
 static void balloon_deflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr offset)
+                                 MemoryRegion *mr, hwaddr mr_offset)
 {
-    void *addr = memory_region_get_ram_ptr(mr) + offset;
+    void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
+    ram_addr_t rb_offset;
     RAMBlock *rb;
     size_t rb_page_size;
-    ram_addr_t ram_offset;
     void *host_addr;
     int ret;
 
     /* XXX is there a better way to get to the RAMBlock than via a
      * host address? */
-    rb = qemu_ram_block_from_host(addr, false, &ram_offset);
+    rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
 
     if (balloon->pbp) {
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 09/12] virtio-balloon: Rework pbp tracking data
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 5/6] virtio-balloon: Rework pbp tracking data David Hildenbrand
  2019-07-23  2:54   ` David Gibson
@ 2019-07-25 15:32   ` Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-stable, David Hildenbrand

From: David Hildenbrand <david@redhat.com>

Using the address of a RAMBlock to test for a matching pbp is not really
safe. Instead, let's use the guest physical address of the base page
along with the page size (via the number of subpages).

Also, let's allocate the bitmap separately. This makes the code
easier to read and maintain - we can reuse bitmap_new().

Prepare the code to move the PBP out of the device.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption with inflates & deflates")
Cc: qemu-stable@nongnu.org #v4.0.0
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-6-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c | 69 +++++++++++++++++++++++++-------------
 1 file changed, 46 insertions(+), 23 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index f206cc8bf7..40d493a31a 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -35,16 +35,44 @@
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
 struct PartiallyBalloonedPage {
-    RAMBlock *rb;
-    ram_addr_t base;
-    unsigned long bitmap[];
+    ram_addr_t base_gpa;
+    long subpages;
+    unsigned long *bitmap;
 };
 
+static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
+{
+    if (!pbp) {
+        return;
+    }
+    g_free(pbp->bitmap);
+    g_free(pbp);
+}
+
+static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
+                                                        long subpages)
+{
+    PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
+
+    pbp->base_gpa = base_gpa;
+    pbp->subpages = subpages;
+    pbp->bitmap = bitmap_new(subpages);
+
+    return pbp;
+}
+
+static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
+                                       ram_addr_t base_gpa, long subpages)
+{
+    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;
+}
+
 static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr mr_offset)
 {
     void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
-    ram_addr_t rb_offset, rb_aligned_offset;
+    ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
+    PartiallyBalloonedPage **pbp = &balloon->pbp;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
@@ -75,39 +103,34 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
 
     rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
     subpages = rb_page_size / BALLOON_PAGE_SIZE;
+    base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
+               (rb_offset - rb_aligned_offset);
 
-    if (balloon->pbp
-        && (rb != balloon->pbp->rb
-            || rb_aligned_offset != balloon->pbp->base)) {
+    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa, subpages)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
-        g_free(balloon->pbp);
-        balloon->pbp = NULL;
+        virtio_balloon_pbp_free(*pbp);
+        *pbp = NULL;
     }
 
-    if (!balloon->pbp) {
-        /* Starting on a new host page */
-        size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
-        balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
-        balloon->pbp->rb = rb;
-        balloon->pbp->base = rb_aligned_offset;
+    if (!*pbp) {
+        *pbp = virtio_balloon_pbp_alloc(base_gpa, subpages);
     }
 
-    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
-            balloon->pbp->bitmap);
+    set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
+            (*pbp)->bitmap);
 
-    if (bitmap_full(balloon->pbp->bitmap, subpages)) {
+    if (bitmap_full((*pbp)->bitmap, subpages)) {
         /* We've accumulated a full host page, we can actually discard
          * it now */
 
-        ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
+        ram_block_discard_range(rb, rb_aligned_offset, rb_page_size);
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
-
-        g_free(balloon->pbp);
-        balloon->pbp = NULL;
+        virtio_balloon_pbp_free(*pbp);
+        *pbp = NULL;
     }
 }
 
@@ -128,7 +151,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
 
     if (balloon->pbp) {
         /* Let's play safe and always reset the pbp on deflation requests. */
-        g_free(balloon->pbp);
+        virtio_balloon_pbp_free(balloon->pbp);
         balloon->pbp = NULL;
     }
 
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 10/12] virtio-balloon: Use temporary PBP only
  2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 6/6] virtio-balloon: Use temporary PBP only David Hildenbrand
  2019-07-23  3:22   ` David Gibson
@ 2019-07-25 15:32   ` Michael S. Tsirkin
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, David Gibson, qemu-stable, David Hildenbrand

From: David Hildenbrand <david@redhat.com>

We still have multiple issues in the current code
- The PBP is not freed during unrealize()
- The PBP is not reset on device resets: After a reset, the PBP is stale.
- We are not indicating VIRTIO_BALLOON_F_MUST_TELL_HOST, therefore
  guests (esp. legacy guests) will reuse pages without deflating,
  turning the PBP stale. Adding that would require compat handling.

Instead, let's use the PBP only temporarily, when processing one bulk of
inflation requests. This will keep guest_page_size > 4k working (with
Linux guests). There is nothing to do for deflation requests anymore.
The pbp is only used for a limited amount of time.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190722134108.22151-7-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
---
 hw/virtio/virtio-balloon.c         | 21 +++++++++------------
 include/hw/virtio/virtio-balloon.h |  3 ---
 2 files changed, 9 insertions(+), 15 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index 40d493a31a..a6282d58d4 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -34,11 +34,11 @@
 
 #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
 
-struct PartiallyBalloonedPage {
+typedef struct PartiallyBalloonedPage {
     ram_addr_t base_gpa;
     long subpages;
     unsigned long *bitmap;
-};
+} PartiallyBalloonedPage;
 
 static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
 {
@@ -68,11 +68,11 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
-                                 MemoryRegion *mr, hwaddr mr_offset)
+                                 MemoryRegion *mr, hwaddr mr_offset,
+                                 PartiallyBalloonedPage **pbp)
 {
     void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
     ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
-    PartiallyBalloonedPage **pbp = &balloon->pbp;
     RAMBlock *rb;
     size_t rb_page_size;
     int subpages;
@@ -149,12 +149,6 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
     rb = qemu_ram_block_from_host(addr, false, &rb_offset);
     rb_page_size = qemu_ram_pagesize(rb);
 
-    if (balloon->pbp) {
-        /* Let's play safe and always reset the pbp on deflation requests. */
-        virtio_balloon_pbp_free(balloon->pbp);
-        balloon->pbp = NULL;
-    }
-
     host_addr = (void *)((uintptr_t)addr & ~(rb_page_size - 1));
 
     /* When a page is deflated, we hint the whole host page it lives
@@ -336,6 +330,7 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
 static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
+    PartiallyBalloonedPage *pbp = NULL;
     VirtQueueElement *elem;
     MemoryRegionSection section;
 
@@ -344,7 +339,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         uint32_t pfn;
         elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
         if (!elem) {
-            return;
+            break;
         }
 
         while (iov_to_buf(elem->out_sg, elem->out_num, offset, &pfn, 4) == 4) {
@@ -373,7 +368,7 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
             if (!qemu_balloon_is_inhibited()) {
                 if (vq == s->ivq) {
                     balloon_inflate_page(s, section.mr,
-                                         section.offset_within_region);
+                                         section.offset_within_region, &pbp);
                 } else if (vq == s->dvq) {
                     balloon_deflate_page(s, section.mr, section.offset_within_region);
                 } else {
@@ -387,6 +382,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         virtio_notify(vdev, vq);
         g_free(elem);
     }
+
+    virtio_balloon_pbp_free(pbp);
 }
 
 static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
diff --git a/include/hw/virtio/virtio-balloon.h b/include/hw/virtio/virtio-balloon.h
index 5a99293a45..d1c968d237 100644
--- a/include/hw/virtio/virtio-balloon.h
+++ b/include/hw/virtio/virtio-balloon.h
@@ -33,8 +33,6 @@ typedef struct virtio_balloon_stat_modern {
        uint64_t val;
 } VirtIOBalloonStatModern;
 
-typedef struct PartiallyBalloonedPage PartiallyBalloonedPage;
-
 enum virtio_balloon_free_page_report_status {
     FREE_PAGE_REPORT_S_STOP = 0,
     FREE_PAGE_REPORT_S_REQUESTED = 1,
@@ -70,7 +68,6 @@ typedef struct VirtIOBalloon {
     int64_t stats_last_update;
     int64_t stats_poll_interval;
     uint32_t host_features;
-    PartiallyBalloonedPage *pbp;
 
     bool qemu_4_0_config_size;
 } VirtIOBalloon;
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 11/12] virtio-balloon: don't track subpages for the PBP
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore David Hildenbrand
@ 2019-07-25 15:32   ` Michael S. Tsirkin
  2019-07-26  8:10   ` [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore David Gibson
  1 sibling, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, David Gibson, David Hildenbrand

From: David Hildenbrand <david@redhat.com>

As ramblocks cannot get removed/readded while we are processing a bulk
of inflation requests, there is no more need to track the page size
in form of the number of subpages.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20190725113638.4702-8-david@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index a6282d58d4..fe9664e42c 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -36,7 +36,6 @@
 
 typedef struct PartiallyBalloonedPage {
     ram_addr_t base_gpa;
-    long subpages;
     unsigned long *bitmap;
 } PartiallyBalloonedPage;
 
@@ -55,16 +54,15 @@ static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
     PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
 
     pbp->base_gpa = base_gpa;
-    pbp->subpages = subpages;
     pbp->bitmap = bitmap_new(subpages);
 
     return pbp;
 }
 
 static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
-                                       ram_addr_t base_gpa, long subpages)
+                                       ram_addr_t base_gpa)
 {
-    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;
+    return pbp->base_gpa == base_gpa;
 }
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
@@ -106,7 +104,7 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
     base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
                (rb_offset - rb_aligned_offset);
 
-    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa, subpages)) {
+    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* [Qemu-devel] [PULL 12/12] virtio-balloon: free pbp more aggressively
  2019-07-25 15:31 [Qemu-devel] [PULL 00/12] virtio, pc: fixes, cleanups Michael S. Tsirkin
                   ` (3 preceding siblings ...)
  2019-07-19  8:54 ` [Qemu-devel] [PATCH] i386/acpi: show PCI Express bus on pxb-pcie expanders Evgeny Yakovlev
@ 2019-07-25 15:32 ` Michael S. Tsirkin
  2019-07-26  9:53 ` [Qemu-devel] [PULL 00/12] virtio, pc: fixes, cleanups Peter Maydell
  5 siblings, 0 replies; 53+ messages in thread
From: Michael S. Tsirkin @ 2019-07-25 15:32 UTC (permalink / raw)
  To: qemu-devel; +Cc: Peter Maydell, qemu-stable, David Hildenbrand

Previous patches switched to a temporary pbp but that does not go far
enough: after device uses a buffer, guest is free to reuse it, so
tracking the page and freeing it later is wrong.

Free and reset the pbp after we push each element.

Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE < host page size")
Cc: qemu-stable@nongnu.org #v4.0.0
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
---
 hw/virtio/virtio-balloon.c | 37 ++++++++++++++++---------------------
 1 file changed, 16 insertions(+), 21 deletions(-)

diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
index fe9664e42c..25de154307 100644
--- a/hw/virtio/virtio-balloon.c
+++ b/hw/virtio/virtio-balloon.c
@@ -41,22 +41,19 @@ typedef struct PartiallyBalloonedPage {
 
 static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
 {
-    if (!pbp) {
+    if (!pbp->bitmap) {
         return;
     }
     g_free(pbp->bitmap);
-    g_free(pbp);
+    pbp->bitmap = NULL;
 }
 
-static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
-                                                        long subpages)
+static void virtio_balloon_pbp_alloc(PartiallyBalloonedPage *pbp,
+                                     ram_addr_t base_gpa,
+                                     long subpages)
 {
-    PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
-
     pbp->base_gpa = base_gpa;
     pbp->bitmap = bitmap_new(subpages);
-
-    return pbp;
 }
 
 static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
@@ -67,7 +64,7 @@ static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
 
 static void balloon_inflate_page(VirtIOBalloon *balloon,
                                  MemoryRegion *mr, hwaddr mr_offset,
-                                 PartiallyBalloonedPage **pbp)
+                                 PartiallyBalloonedPage *pbp)
 {
     void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
     ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
@@ -104,22 +101,21 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
     base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
                (rb_offset - rb_aligned_offset);
 
-    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa)) {
+    if (pbp->bitmap && !virtio_balloon_pbp_matches(pbp, base_gpa)) {
         /* We've partially ballooned part of a host page, but now
          * we're trying to balloon part of a different one.  Too hard,
          * give up on the old partial page */
-        virtio_balloon_pbp_free(*pbp);
-        *pbp = NULL;
+        virtio_balloon_pbp_free(pbp);
     }
 
-    if (!*pbp) {
-        *pbp = virtio_balloon_pbp_alloc(base_gpa, subpages);
+    if (!pbp->bitmap) {
+        virtio_balloon_pbp_alloc(pbp, base_gpa, subpages);
     }
 
     set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
-            (*pbp)->bitmap);
+            pbp->bitmap);
 
-    if (bitmap_full((*pbp)->bitmap, subpages)) {
+    if (bitmap_full(pbp->bitmap, subpages)) {
         /* We've accumulated a full host page, we can actually discard
          * it now */
 
@@ -127,8 +123,7 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
         /* We ignore errors from ram_block_discard_range(), because it
          * has already reported them, and failing to discard a balloon
          * page is not fatal */
-        virtio_balloon_pbp_free(*pbp);
-        *pbp = NULL;
+        virtio_balloon_pbp_free(pbp);
     }
 }
 
@@ -328,13 +323,14 @@ static void balloon_stats_set_poll_interval(Object *obj, Visitor *v,
 static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
 {
     VirtIOBalloon *s = VIRTIO_BALLOON(vdev);
-    PartiallyBalloonedPage *pbp = NULL;
     VirtQueueElement *elem;
     MemoryRegionSection section;
 
     for (;;) {
+        PartiallyBalloonedPage pbp = {};
         size_t offset = 0;
         uint32_t pfn;
+
         elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
         if (!elem) {
             break;
@@ -379,9 +375,8 @@ static void virtio_balloon_handle_output(VirtIODevice *vdev, VirtQueue *vq)
         virtqueue_push(vq, elem, offset);
         virtio_notify(vdev, vq);
         g_free(elem);
+        virtio_balloon_pbp_free(&pbp);
     }
-
-    virtio_balloon_pbp_free(pbp);
 }
 
 static void virtio_balloon_receive_stats(VirtIODevice *vdev, VirtQueue *vq)
-- 
MST



^ permalink raw reply related	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v4 5/7] virtio-balloon: Rework pbp tracking data
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 5/7] virtio-balloon: Rework pbp tracking data David Hildenbrand
@ 2019-07-26  8:08   ` David Gibson
  0 siblings, 0 replies; 53+ messages in thread
From: David Gibson @ 2019-07-26  8:08 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-stable, qemu-devel, Stefan Hajnoczi,
	Michael S . Tsirkin

[-- Attachment #1: Type: text/plain, Size: 5494 bytes --]

On Thu, Jul 25, 2019 at 01:36:36PM +0200, David Hildenbrand wrote:
> Using the address of a RAMBlock to test for a matching pbp is not really
> safe. Instead, let's use the guest physical address of the base page
> along with the page size (via the number of subpages).
> 
> Also, let's allocate the bitmap separately. This makes the code
> easier to read and maintain - we can reuse bitmap_new().
> 
> Prepare the code to move the PBP out of the device.
> 
> Fixes: ed48c59875b6 ("virtio-balloon: Safely handle BALLOON_PAGE_SIZE <
>                      host page size")
> Fixes: b27b32391404 ("virtio-balloon: Fix possible guest memory corruption
>                      with inflates & deflates")
> Cc: qemu-stable@nongnu.org #v4.0.0
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/virtio/virtio-balloon.c | 69 +++++++++++++++++++++++++-------------
>  1 file changed, 46 insertions(+), 23 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index f206cc8bf7..40d493a31a 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -35,16 +35,44 @@
>  #define BALLOON_PAGE_SIZE  (1 << VIRTIO_BALLOON_PFN_SHIFT)
>  
>  struct PartiallyBalloonedPage {
> -    RAMBlock *rb;
> -    ram_addr_t base;
> -    unsigned long bitmap[];
> +    ram_addr_t base_gpa;
> +    long subpages;
> +    unsigned long *bitmap;
>  };
>  
> +static void virtio_balloon_pbp_free(PartiallyBalloonedPage *pbp)
> +{
> +    if (!pbp) {
> +        return;
> +    }
> +    g_free(pbp->bitmap);
> +    g_free(pbp);
> +}
> +
> +static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
> +                                                        long subpages)
> +{
> +    PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
> +
> +    pbp->base_gpa = base_gpa;
> +    pbp->subpages = subpages;
> +    pbp->bitmap = bitmap_new(subpages);
> +
> +    return pbp;
> +}
> +
> +static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
> +                                       ram_addr_t base_gpa, long subpages)
> +{
> +    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;
> +}
> +
>  static void balloon_inflate_page(VirtIOBalloon *balloon,
>                                   MemoryRegion *mr, hwaddr mr_offset)
>  {
>      void *addr = memory_region_get_ram_ptr(mr) + mr_offset;
> -    ram_addr_t rb_offset, rb_aligned_offset;
> +    ram_addr_t rb_offset, rb_aligned_offset, base_gpa;
> +    PartiallyBalloonedPage **pbp = &balloon->pbp;
>      RAMBlock *rb;
>      size_t rb_page_size;
>      int subpages;
> @@ -75,39 +103,34 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>  
>      rb_aligned_offset = QEMU_ALIGN_DOWN(rb_offset, rb_page_size);
>      subpages = rb_page_size / BALLOON_PAGE_SIZE;
> +    base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
> +               (rb_offset - rb_aligned_offset);
>  
> -    if (balloon->pbp
> -        && (rb != balloon->pbp->rb
> -            || rb_aligned_offset != balloon->pbp->base)) {
> +    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa, subpages)) {
>          /* We've partially ballooned part of a host page, but now
>           * we're trying to balloon part of a different one.  Too hard,
>           * give up on the old partial page */
> -        g_free(balloon->pbp);
> -        balloon->pbp = NULL;
> +        virtio_balloon_pbp_free(*pbp);
> +        *pbp = NULL;
>      }
>  
> -    if (!balloon->pbp) {
> -        /* Starting on a new host page */
> -        size_t bitlen = BITS_TO_LONGS(subpages) * sizeof(unsigned long);
> -        balloon->pbp = g_malloc0(sizeof(PartiallyBalloonedPage) + bitlen);
> -        balloon->pbp->rb = rb;
> -        balloon->pbp->base = rb_aligned_offset;
> +    if (!*pbp) {
> +        *pbp = virtio_balloon_pbp_alloc(base_gpa, subpages);
>      }
>  
> -    set_bit((rb_offset - balloon->pbp->base) / BALLOON_PAGE_SIZE,
> -            balloon->pbp->bitmap);
> +    set_bit((rb_offset - rb_aligned_offset) / BALLOON_PAGE_SIZE,
> +            (*pbp)->bitmap);
>  
> -    if (bitmap_full(balloon->pbp->bitmap, subpages)) {
> +    if (bitmap_full((*pbp)->bitmap, subpages)) {
>          /* We've accumulated a full host page, we can actually discard
>           * it now */
>  
> -        ram_block_discard_range(rb, balloon->pbp->base, rb_page_size);
> +        ram_block_discard_range(rb, rb_aligned_offset, rb_page_size);
>          /* We ignore errors from ram_block_discard_range(), because it
>           * has already reported them, and failing to discard a balloon
>           * page is not fatal */
> -
> -        g_free(balloon->pbp);
> -        balloon->pbp = NULL;
> +        virtio_balloon_pbp_free(*pbp);
> +        *pbp = NULL;
>      }
>  }
>  
> @@ -128,7 +151,7 @@ static void balloon_deflate_page(VirtIOBalloon *balloon,
>  
>      if (balloon->pbp) {
>          /* Let's play safe and always reset the pbp on deflation requests. */
> -        g_free(balloon->pbp);
> +        virtio_balloon_pbp_free(balloon->pbp);
>          balloon->pbp = NULL;
>      }
>  

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore
  2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore David Hildenbrand
  2019-07-25 15:32   ` [Qemu-devel] [PULL 11/12] virtio-balloon: don't track subpages for the PBP Michael S. Tsirkin
@ 2019-07-26  8:10   ` David Gibson
  1 sibling, 0 replies; 53+ messages in thread
From: David Gibson @ 2019-07-26  8:10 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Igor Mammedov, qemu-stable, qemu-devel, Stefan Hajnoczi,
	Michael S . Tsirkin

[-- Attachment #1: Type: text/plain, Size: 2377 bytes --]

On Thu, Jul 25, 2019 at 01:36:38PM +0200, David Hildenbrand wrote:
> As ramblocks cannot get removed/readded while we are processing a bulk
> of inflation requests, there is no more need to track the page size
> in form of the number of subpages.
> 
> Suggested-by: David Gibson <david@gibson.dropbear.id.au>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>

> ---
>  hw/virtio/virtio-balloon.c | 8 +++-----
>  1 file changed, 3 insertions(+), 5 deletions(-)
> 
> diff --git a/hw/virtio/virtio-balloon.c b/hw/virtio/virtio-balloon.c
> index a6282d58d4..fe9664e42c 100644
> --- a/hw/virtio/virtio-balloon.c
> +++ b/hw/virtio/virtio-balloon.c
> @@ -36,7 +36,6 @@
>  
>  typedef struct PartiallyBalloonedPage {
>      ram_addr_t base_gpa;
> -    long subpages;
>      unsigned long *bitmap;
>  } PartiallyBalloonedPage;
>  
> @@ -55,16 +54,15 @@ static PartiallyBalloonedPage *virtio_balloon_pbp_alloc(ram_addr_t base_gpa,
>      PartiallyBalloonedPage *pbp = g_new0(PartiallyBalloonedPage, 1);
>  
>      pbp->base_gpa = base_gpa;
> -    pbp->subpages = subpages;
>      pbp->bitmap = bitmap_new(subpages);
>  
>      return pbp;
>  }
>  
>  static bool virtio_balloon_pbp_matches(PartiallyBalloonedPage *pbp,
> -                                       ram_addr_t base_gpa, long subpages)
> +                                       ram_addr_t base_gpa)
>  {
> -    return pbp->subpages == subpages && pbp->base_gpa == base_gpa;
> +    return pbp->base_gpa == base_gpa;
>  }
>  
>  static void balloon_inflate_page(VirtIOBalloon *balloon,
> @@ -106,7 +104,7 @@ static void balloon_inflate_page(VirtIOBalloon *balloon,
>      base_gpa = memory_region_get_ram_addr(mr) + mr_offset -
>                 (rb_offset - rb_aligned_offset);
>  
> -    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa, subpages)) {
> +    if (*pbp && !virtio_balloon_pbp_matches(*pbp, base_gpa)) {
>          /* We've partially ballooned part of a host page, but now
>           * we're trying to balloon part of a different one.  Too hard,
>           * give up on the old partial page */

-- 
David Gibson			| I'll have my music baroque, and my code
david AT gibson.dropbear.id.au	| minimalist, thank you.  NOT _the_ _other_
				| _way_ _around_!
http://www.ozlabs.org/~dgibson

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 833 bytes --]

^ permalink raw reply	[flat|nested] 53+ messages in thread

* Re: [Qemu-devel] [PULL 00/12] virtio, pc: fixes, cleanups
  2019-07-25 15:31 [Qemu-devel] [PULL 00/12] virtio, pc: fixes, cleanups Michael S. Tsirkin
                   ` (4 preceding siblings ...)
  2019-07-25 15:32 ` [Qemu-devel] [PULL 12/12] virtio-balloon: free pbp more aggressively Michael S. Tsirkin
@ 2019-07-26  9:53 ` Peter Maydell
  5 siblings, 0 replies; 53+ messages in thread
From: Peter Maydell @ 2019-07-26  9:53 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: QEMU Developers

On Thu, 25 Jul 2019 at 16:31, Michael S. Tsirkin <mst@redhat.com> wrote:
>
> The following changes since commit bf8b024372bf8abf5a9f40bfa65eeefad23ff988:
>
>   Update version for v4.1.0-rc2 release (2019-07-23 18:28:08 +0100)
>
> are available in the Git repository at:
>
>   git://git.kernel.org/pub/scm/virt/kvm/mst/qemu.git tags/for_upstream
>
> for you to fetch changes up to 1b47b37c33ec01ae1efc527f4c97f97f93723bc4:
>
>   virtio-balloon: free pbp more aggressively (2019-07-25 11:19:25 -0400)
>
> ----------------------------------------------------------------
> virtio, pc: fixes, cleanups
>
> A bunch of fixes all over the place.
>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
>


Applied, thanks.

Please update the changelog at https://wiki.qemu.org/ChangeLog/4.1
for any user-visible changes.

-- PMM


^ permalink raw reply	[flat|nested] 53+ messages in thread

end of thread, other threads:[~2019-07-26  9:53 UTC | newest]

Thread overview: 53+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2019-07-25 15:31 [Qemu-devel] [PULL 00/12] virtio, pc: fixes, cleanups Michael S. Tsirkin
2019-06-02 11:42 ` [Qemu-devel] [PATCH] ioapic: kvm: Skip route updates for masked pins Jan Kiszka
2019-06-02 12:10   ` Peter Xu
2019-06-03  6:30     ` Jan Kiszka
2019-06-03  0:36   ` Michael S. Tsirkin
2019-07-21  8:58     ` Jan Kiszka
2019-07-21 10:04       ` Michael S. Tsirkin
2019-07-21 16:55         ` Paolo Bonzini
2019-07-25 15:31   ` [Qemu-devel] [PULL 03/12] " Michael S. Tsirkin
2019-06-24  9:13 ` [Qemu-devel] [PATCH] docs: clarify multiqueue vs multiple virtqueues Stefan Hajnoczi
2019-06-24 10:19   ` Marc-André Lureau
2019-07-17 10:14   ` Stefan Hajnoczi
2019-07-17 10:35     ` Michael S. Tsirkin
2019-07-25 15:31   ` [Qemu-devel] [PULL 01/12] " Michael S. Tsirkin
2019-07-18 16:14 ` [Qemu-devel] [PATCH v2] i386/acpi: fix gint overflow in crs_range_compare Evgeny Yakovlev
2019-07-18 20:30   ` Michael S. Tsirkin
2019-07-25 15:31   ` [Qemu-devel] [PULL 02/12] " Michael S. Tsirkin
2019-07-19  8:54 ` [Qemu-devel] [PATCH] i386/acpi: show PCI Express bus on pxb-pcie expanders Evgeny Yakovlev
2019-07-19 12:14   ` Igor Mammedov
2019-07-25 15:31   ` [Qemu-devel] [PULL 04/12] " Michael S. Tsirkin
2019-07-25 15:32 ` [Qemu-devel] [PULL 12/12] virtio-balloon: free pbp more aggressively Michael S. Tsirkin
2019-07-26  9:53 ` [Qemu-devel] [PULL 00/12] virtio, pc: fixes, cleanups Peter Maydell
  -- strict thread matches above, loose matches on Subject: below --
2019-07-25 11:36 [Qemu-devel] [PATCH-for-4.1 v4 0/7] virtio-balloon: fixes David Hildenbrand
2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 1/7] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
2019-07-25 12:36   ` Pankaj Gupta
2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 2/7] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 3/7] virtio-balloon: Simplify deflate with pbp David Hildenbrand
2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 4/7] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 5/7] virtio-balloon: Rework pbp tracking data David Hildenbrand
2019-07-26  8:08   ` David Gibson
2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 6/7] virtio-balloon: Use temporary PBP only David Hildenbrand
2019-07-25 11:53   ` Michael S. Tsirkin
2019-07-25 11:56     ` David Hildenbrand
2019-07-25 11:36 ` [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore David Hildenbrand
2019-07-25 15:32   ` [Qemu-devel] [PULL 11/12] virtio-balloon: don't track subpages for the PBP Michael S. Tsirkin
2019-07-26  8:10   ` [Qemu-devel] [PATCH-for-4.1 v4 7/7] virtio-balloon: No need to track subpages for the PBP anymore David Gibson
2019-07-22 13:41 [Qemu-devel] [PATCH-for-4.1 v3 0/6] virtio-balloon: fixes David Hildenbrand
2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 1/6] virtio-balloon: Fix wrong sign extension of PFNs David Hildenbrand
2019-07-23  2:27   ` David Gibson
2019-07-25 15:31   ` [Qemu-devel] [PULL 05/12] " Michael S. Tsirkin
2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 2/6] virtio-balloon: Fix QEMU crashes on pagesize > BALLOON_PAGE_SIZE David Hildenbrand
2019-07-25 15:31   ` [Qemu-devel] [PULL 06/12] " Michael S. Tsirkin
2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 3/6] virtio-balloon: Simplify deflate with pbp David Hildenbrand
2019-07-25 15:32   ` [Qemu-devel] [PULL 07/12] " Michael S. Tsirkin
2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 4/6] virtio-balloon: Better names for offset variables in inflate/deflate code David Hildenbrand
2019-07-25 15:32   ` [Qemu-devel] [PULL 08/12] " Michael S. Tsirkin
2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 5/6] virtio-balloon: Rework pbp tracking data David Hildenbrand
2019-07-23  2:54   ` David Gibson
2019-07-23  7:38     ` David Hildenbrand
2019-07-25 15:32   ` [Qemu-devel] [PULL 09/12] " Michael S. Tsirkin
2019-07-22 13:41 ` [Qemu-devel] [PATCH-for-4.1 v3 6/6] virtio-balloon: Use temporary PBP only David Hildenbrand
2019-07-23  3:22   ` David Gibson
2019-07-25 15:32   ` [Qemu-devel] [PULL 10/12] " Michael S. Tsirkin

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).