All of lore.kernel.org
 help / color / mirror / Atom feed
* [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister
@ 2015-08-25  9:05 Jason Wang
  2015-08-25  9:05 ` [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses Jason Wang
                   ` (2 more replies)
  0 siblings, 3 replies; 19+ messages in thread
From: Jason Wang @ 2015-08-25  9:05 UTC (permalink / raw)
  To: gleb, pbonzini, kvm, linux-kernel
  Cc: cornelia.huck, Jason Wang, Michael S. Tsirkin

All fields of kvm_io_range were initialized or copied explicitly
afterwards. So switch to use kmalloc().

Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 virt/kvm/kvm_main.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 8b8a444..0d79fe8 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3248,7 +3248,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
 	if (bus->dev_count - bus->ioeventfd_count > NR_IOBUS_DEVS - 1)
 		return -ENOSPC;
 
-	new_bus = kzalloc(sizeof(*bus) + ((bus->dev_count + 1) *
+	new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count + 1) *
 			  sizeof(struct kvm_io_range)), GFP_KERNEL);
 	if (!new_bus)
 		return -ENOMEM;
@@ -3280,7 +3280,7 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
 	if (r)
 		return r;
 
-	new_bus = kzalloc(sizeof(*bus) + ((bus->dev_count - 1) *
+	new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count - 1) *
 			  sizeof(struct kvm_io_range)), GFP_KERNEL);
 	if (!new_bus)
 		return -ENOMEM;
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-25  9:05 [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister Jason Wang
@ 2015-08-25  9:05 ` Jason Wang
  2015-08-25 10:24   ` Cornelia Huck
                     ` (2 more replies)
  2015-08-25  9:05 ` [PATCH V3 3/3] kvm: add tracepoint for fast mmio Jason Wang
  2015-09-07 10:33 ` [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister Paolo Bonzini
  2 siblings, 3 replies; 19+ messages in thread
From: Jason Wang @ 2015-08-25  9:05 UTC (permalink / raw)
  To: gleb, pbonzini, kvm, linux-kernel
  Cc: cornelia.huck, Jason Wang, Michael S. Tsirkin

We register wildcard mmio eventfd on two buses, one for KVM_MMIO_BUS
and another is KVM_FAST_MMIO_BUS. This leads to issue:

- kvm_io_bus_destroy() knows nothing about the devices on two buses
  points to a single dev. Which will lead double free [1] during exit.
- wildcard eventfd ignores data len, so it was registered as a
  kvm_io_range with zero length. This will fail the binary search in
  kvm_io_bus_get_first_dev() when we try to emulate through
  KVM_MMIO_BUS. This will cause userspace io emulation request instead
  of a eventfd notification (virtqueue kick will be trapped by qemu
  instead of vhost in this case).

Fixing this by don't register wildcard mmio eventfd on two
buses. Instead, only register it in KVM_FAST_MMIO_BUS. This fixes the
double free issue of kvm_io_bus_destroy(). For the arch/setups that
does not utilize KVM_FAST_MMIO_BUS, before searching KVM_MMIO_BUS, try
KVM_FAST_MMIO_BUS first to see it it has a match.

[1] Panic caused by double free:

CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
Stack:
ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
 0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
Call Trace:
[<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
[<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
[<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
[<ffffffff811f69f7>] __fput+0xe7/0x250
[<ffffffff811f6bae>] ____fput+0xe/0x10
[<ffffffff81093f04>] task_work_run+0xd4/0xf0
[<ffffffff81079358>] do_exit+0x368/0xa50
[<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
[<ffffffff81079ad5>] do_group_exit+0x45/0xb0
[<ffffffff81085c71>] get_signal+0x291/0x750
[<ffffffff810144d8>] do_signal+0x28/0xab0
[<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
[<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
[<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
[<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
[<ffffffff817cb9af>] int_signal+0x12/0x17
Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
RSP <ffff88020e7f3bc8>

Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
Changes from V2:
- Tweak styles and comment suggested by Cornelia.
Changes from v1:
- change ioeventfd_bus_from_flags() to return KVM_FAST_MMIO_BUS when
  needed to save lots of unnecessary changes.
---
 virt/kvm/eventfd.c  | 31 +++++++++----------------------
 virt/kvm/kvm_main.c | 16 ++++++++++++++--
 2 files changed, 23 insertions(+), 24 deletions(-)

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index 9ff4193..c3ffdc3 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -762,13 +762,16 @@ ioeventfd_check_collision(struct kvm *kvm, struct _ioeventfd *p)
 	return false;
 }
 
-static enum kvm_bus ioeventfd_bus_from_flags(__u32 flags)
+static enum kvm_bus ioeventfd_bus_from_args(struct kvm_ioeventfd *args)
 {
-	if (flags & KVM_IOEVENTFD_FLAG_PIO)
+	if (args->flags & KVM_IOEVENTFD_FLAG_PIO)
 		return KVM_PIO_BUS;
-	if (flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
+	if (args->flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
 		return KVM_VIRTIO_CCW_NOTIFY_BUS;
-	return KVM_MMIO_BUS;
+	/* When length is ignored, MMIO is put on a separate bus, for
+	 * faster lookups.
+	 */
+	return args->len ? KVM_MMIO_BUS : KVM_FAST_MMIO_BUS;
 }
 
 static int
@@ -779,7 +782,7 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
 	struct eventfd_ctx       *eventfd;
 	int                       ret;
 
-	bus_idx = ioeventfd_bus_from_flags(args->flags);
+	bus_idx = ioeventfd_bus_from_args(args);
 	/* must be natural-word sized, or 0 to ignore length */
 	switch (args->len) {
 	case 0:
@@ -843,16 +846,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
 	if (ret < 0)
 		goto unlock_fail;
 
-	/* When length is ignored, MMIO is also put on a separate bus, for
-	 * faster lookups.
-	 */
-	if (!args->len && !(args->flags & KVM_IOEVENTFD_FLAG_PIO)) {
-		ret = kvm_io_bus_register_dev(kvm, KVM_FAST_MMIO_BUS,
-					      p->addr, 0, &p->dev);
-		if (ret < 0)
-			goto register_fail;
-	}
-
 	kvm->buses[bus_idx]->ioeventfd_count++;
 	list_add_tail(&p->list, &kvm->ioeventfds);
 
@@ -860,8 +853,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
 
 	return 0;
 
-register_fail:
-	kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
 unlock_fail:
 	mutex_unlock(&kvm->slots_lock);
 
@@ -880,7 +871,7 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
 	struct eventfd_ctx       *eventfd;
 	int                       ret = -ENOENT;
 
-	bus_idx = ioeventfd_bus_from_flags(args->flags);
+	bus_idx = ioeventfd_bus_from_args(args);
 	eventfd = eventfd_ctx_fdget(args->fd);
 	if (IS_ERR(eventfd))
 		return PTR_ERR(eventfd);
@@ -901,10 +892,6 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
 			continue;
 
 		kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
-		if (!p->length) {
-			kvm_io_bus_unregister_dev(kvm, KVM_FAST_MMIO_BUS,
-						  &p->dev);
-		}
 		kvm->buses[bus_idx]->ioeventfd_count--;
 		ioeventfd_release(p);
 		ret = 0;
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 0d79fe8..3e3b3bc 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -3152,8 +3152,8 @@ static int __kvm_io_bus_write(struct kvm_vcpu *vcpu, struct kvm_io_bus *bus,
 }
 
 /* kvm_io_bus_write - called under kvm->slots_lock */
-int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
-		     int len, const void *val)
+static int kvm_io_bus_write_idx(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
+				gpa_t addr, int len, const void *val)
 {
 	struct kvm_io_bus *bus;
 	struct kvm_io_range range;
@@ -3169,6 +3169,18 @@ int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
 	return r < 0 ? r : 0;
 }
 
+int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
+		     int len, const void *val)
+{
+	int r = 0;
+
+	if (bus_idx != KVM_MMIO_BUS ||
+	    kvm_io_bus_write_idx(vcpu, KVM_FAST_MMIO_BUS, addr, 0, NULL) < 0)
+		r = kvm_io_bus_write_idx(vcpu, bus_idx, addr, len, val);
+
+	return r < 0 ? r : 0;
+}
+
 /* kvm_io_bus_write_cookie - called under kvm->slots_lock */
 int kvm_io_bus_write_cookie(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
 			    gpa_t addr, int len, const void *val, long cookie)
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* [PATCH V3 3/3] kvm: add tracepoint for fast mmio
  2015-08-25  9:05 [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister Jason Wang
  2015-08-25  9:05 ` [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses Jason Wang
@ 2015-08-25  9:05 ` Jason Wang
  2015-08-25 11:36   ` Michael S. Tsirkin
  2015-09-07 10:33 ` [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister Paolo Bonzini
  2 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2015-08-25  9:05 UTC (permalink / raw)
  To: gleb, pbonzini, kvm, linux-kernel
  Cc: cornelia.huck, Jason Wang, Michael S. Tsirkin

Cc: Gleb Natapov <gleb@kernel.org>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
---
 arch/x86/kvm/trace.h | 17 +++++++++++++++++
 arch/x86/kvm/vmx.c   |  1 +
 arch/x86/kvm/x86.c   |  1 +
 3 files changed, 19 insertions(+)

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index 4eae7c3..2d4e81a 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -128,6 +128,23 @@ TRACE_EVENT(kvm_pio,
 		  __entry->count > 1 ? "(...)" : "")
 );
 
+TRACE_EVENT(kvm_fast_mmio,
+	TP_PROTO(u64 gpa),
+	TP_ARGS(gpa),
+
+	TP_STRUCT__entry(
+		__field(u64,	gpa)
+	),
+
+	TP_fast_assign(
+		__entry->gpa		= gpa;
+	),
+
+	TP_printk("fast mmio at gpa 0x%llx", __entry->gpa)
+);
+
+
+
 /*
  * Tracepoint for cpuid.
  */
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index 83b7b5c..a55d279 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -5831,6 +5831,7 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
 	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
 	if (!kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
 		skip_emulated_instruction(vcpu);
+		trace_kvm_fast_mmio(gpa);
 		return 1;
 	}
 
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8f0f6ec..36cf78e 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -8254,6 +8254,7 @@ bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
 EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma);
 
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
 EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_msr);
-- 
2.1.4


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-25  9:05 ` [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses Jason Wang
@ 2015-08-25 10:24   ` Cornelia Huck
  2015-08-25 11:37   ` Michael S. Tsirkin
  2015-08-25 11:51   ` Michael S. Tsirkin
  2 siblings, 0 replies; 19+ messages in thread
From: Cornelia Huck @ 2015-08-25 10:24 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, Michael S. Tsirkin

On Tue, 25 Aug 2015 17:05:47 +0800
Jason Wang <jasowang@redhat.com> wrote:

> We register wildcard mmio eventfd on two buses, one for KVM_MMIO_BUS
> and another is KVM_FAST_MMIO_BUS. This leads to issue:
> 
> - kvm_io_bus_destroy() knows nothing about the devices on two buses
>   points to a single dev. Which will lead double free [1] during exit.
> - wildcard eventfd ignores data len, so it was registered as a
>   kvm_io_range with zero length. This will fail the binary search in
>   kvm_io_bus_get_first_dev() when we try to emulate through
>   KVM_MMIO_BUS. This will cause userspace io emulation request instead
>   of a eventfd notification (virtqueue kick will be trapped by qemu
>   instead of vhost in this case).
> 
> Fixing this by don't register wildcard mmio eventfd on two
> buses. Instead, only register it in KVM_FAST_MMIO_BUS. This fixes the
> double free issue of kvm_io_bus_destroy(). For the arch/setups that
> does not utilize KVM_FAST_MMIO_BUS, before searching KVM_MMIO_BUS, try
> KVM_FAST_MMIO_BUS first to see it it has a match.
> 
> [1] Panic caused by double free:
> 
> CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
> Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
> task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
> RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
> RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
> RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
> RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
> RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
> R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
> R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
> FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
> Stack:
> ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
> ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
>  0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
> Call Trace:
> [<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
> [<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
> [<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
> [<ffffffff811f69f7>] __fput+0xe7/0x250
> [<ffffffff811f6bae>] ____fput+0xe/0x10
> [<ffffffff81093f04>] task_work_run+0xd4/0xf0
> [<ffffffff81079358>] do_exit+0x368/0xa50
> [<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
> [<ffffffff81079ad5>] do_group_exit+0x45/0xb0
> [<ffffffff81085c71>] get_signal+0x291/0x750
> [<ffffffff810144d8>] do_signal+0x28/0xab0
> [<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
> [<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
> [<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
> [<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
> [<ffffffff817cb9af>] int_signal+0x12/0x17
> Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
> RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
> RSP <ffff88020e7f3bc8>
> 
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> Changes from V2:
> - Tweak styles and comment suggested by Cornelia.
> Changes from v1:
> - change ioeventfd_bus_from_flags() to return KVM_FAST_MMIO_BUS when
>   needed to save lots of unnecessary changes.
> ---
>  virt/kvm/eventfd.c  | 31 +++++++++----------------------
>  virt/kvm/kvm_main.c | 16 ++++++++++++++--
>  2 files changed, 23 insertions(+), 24 deletions(-)

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 3/3] kvm: add tracepoint for fast mmio
  2015-08-25  9:05 ` [PATCH V3 3/3] kvm: add tracepoint for fast mmio Jason Wang
@ 2015-08-25 11:36   ` Michael S. Tsirkin
  0 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-08-25 11:36 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck

On Tue, Aug 25, 2015 at 05:05:48PM +0800, Jason Wang wrote:
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  arch/x86/kvm/trace.h | 17 +++++++++++++++++
>  arch/x86/kvm/vmx.c   |  1 +
>  arch/x86/kvm/x86.c   |  1 +
>  3 files changed, 19 insertions(+)
> 
> diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
> index 4eae7c3..2d4e81a 100644
> --- a/arch/x86/kvm/trace.h
> +++ b/arch/x86/kvm/trace.h
> @@ -128,6 +128,23 @@ TRACE_EVENT(kvm_pio,
>  		  __entry->count > 1 ? "(...)" : "")
>  );
>  
> +TRACE_EVENT(kvm_fast_mmio,
> +	TP_PROTO(u64 gpa),
> +	TP_ARGS(gpa),
> +
> +	TP_STRUCT__entry(
> +		__field(u64,	gpa)
> +	),
> +
> +	TP_fast_assign(
> +		__entry->gpa		= gpa;
> +	),
> +
> +	TP_printk("fast mmio at gpa 0x%llx", __entry->gpa)
> +);
> +
> +
> +

don't add multiple empty lines pls.

>  /*
>   * Tracepoint for cpuid.
>   */
> diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
> index 83b7b5c..a55d279 100644
> --- a/arch/x86/kvm/vmx.c
> +++ b/arch/x86/kvm/vmx.c
> @@ -5831,6 +5831,7 @@ static int handle_ept_misconfig(struct kvm_vcpu *vcpu)
>  	gpa = vmcs_read64(GUEST_PHYSICAL_ADDRESS);
>  	if (!kvm_io_bus_write(vcpu, KVM_FAST_MMIO_BUS, gpa, 0, NULL)) {
>  		skip_emulated_instruction(vcpu);
> +		trace_kvm_fast_mmio(gpa);
>  		return 1;
>  	}
>  
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 8f0f6ec..36cf78e 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -8254,6 +8254,7 @@ bool kvm_arch_has_noncoherent_dma(struct kvm *kvm)
>  EXPORT_SYMBOL_GPL(kvm_arch_has_noncoherent_dma);
>  
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_exit);
> +EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_fast_mmio);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_inj_virq);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_page_fault);
>  EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_msr);
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-25  9:05 ` [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses Jason Wang
  2015-08-25 10:24   ` Cornelia Huck
@ 2015-08-25 11:37   ` Michael S. Tsirkin
  2015-08-25 11:51   ` Michael S. Tsirkin
  2 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-08-25 11:37 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck

On Tue, Aug 25, 2015 at 05:05:47PM +0800, Jason Wang wrote:
> We register wildcard mmio eventfd on two buses, one for KVM_MMIO_BUS
> and another is KVM_FAST_MMIO_BUS. This leads to issue:
> 
> - kvm_io_bus_destroy() knows nothing about the devices on two buses
>   points to a single dev. Which will lead double free [1] during exit.
> - wildcard eventfd ignores data len, so it was registered as a
>   kvm_io_range with zero length. This will fail the binary search in
>   kvm_io_bus_get_first_dev() when we try to emulate through
>   KVM_MMIO_BUS. This will cause userspace io emulation request instead
>   of a eventfd notification (virtqueue kick will be trapped by qemu
>   instead of vhost in this case).
> 
> Fixing this by don't register wildcard mmio eventfd on two
> buses. Instead, only register it in KVM_FAST_MMIO_BUS. This fixes the
> double free issue of kvm_io_bus_destroy(). For the arch/setups that
> does not utilize KVM_FAST_MMIO_BUS, before searching KVM_MMIO_BUS, try
> KVM_FAST_MMIO_BUS first to see it it has a match.
> 
> [1] Panic caused by double free:
> 
> CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
> Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
> task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
> RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
> RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
> RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
> RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
> RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
> R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
> R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
> FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
> Stack:
> ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
> ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
>  0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
> Call Trace:
> [<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
> [<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
> [<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
> [<ffffffff811f69f7>] __fput+0xe7/0x250
> [<ffffffff811f6bae>] ____fput+0xe/0x10
> [<ffffffff81093f04>] task_work_run+0xd4/0xf0
> [<ffffffff81079358>] do_exit+0x368/0xa50
> [<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
> [<ffffffff81079ad5>] do_group_exit+0x45/0xb0
> [<ffffffff81085c71>] get_signal+0x291/0x750
> [<ffffffff810144d8>] do_signal+0x28/0xab0
> [<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
> [<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
> [<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
> [<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
> [<ffffffff817cb9af>] int_signal+0x12/0x17
> Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
> RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
> RSP <ffff88020e7f3bc8>
> 
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>

Saw v3 too late. Pls see my comments on v2.

> ---
> Changes from V2:
> - Tweak styles and comment suggested by Cornelia.
> Changes from v1:
> - change ioeventfd_bus_from_flags() to return KVM_FAST_MMIO_BUS when
>   needed to save lots of unnecessary changes.
> ---
>  virt/kvm/eventfd.c  | 31 +++++++++----------------------
>  virt/kvm/kvm_main.c | 16 ++++++++++++++--
>  2 files changed, 23 insertions(+), 24 deletions(-)
> 
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index 9ff4193..c3ffdc3 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -762,13 +762,16 @@ ioeventfd_check_collision(struct kvm *kvm, struct _ioeventfd *p)
>  	return false;
>  }
>  
> -static enum kvm_bus ioeventfd_bus_from_flags(__u32 flags)
> +static enum kvm_bus ioeventfd_bus_from_args(struct kvm_ioeventfd *args)
>  {
> -	if (flags & KVM_IOEVENTFD_FLAG_PIO)
> +	if (args->flags & KVM_IOEVENTFD_FLAG_PIO)
>  		return KVM_PIO_BUS;
> -	if (flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
> +	if (args->flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
>  		return KVM_VIRTIO_CCW_NOTIFY_BUS;
> -	return KVM_MMIO_BUS;
> +	/* When length is ignored, MMIO is put on a separate bus, for
> +	 * faster lookups.
> +	 */
> +	return args->len ? KVM_MMIO_BUS : KVM_FAST_MMIO_BUS;
>  }
>  
>  static int
> @@ -779,7 +782,7 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  	struct eventfd_ctx       *eventfd;
>  	int                       ret;
>  
> -	bus_idx = ioeventfd_bus_from_flags(args->flags);
> +	bus_idx = ioeventfd_bus_from_args(args);
>  	/* must be natural-word sized, or 0 to ignore length */
>  	switch (args->len) {
>  	case 0:
> @@ -843,16 +846,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  	if (ret < 0)
>  		goto unlock_fail;
>  
> -	/* When length is ignored, MMIO is also put on a separate bus, for
> -	 * faster lookups.
> -	 */
> -	if (!args->len && !(args->flags & KVM_IOEVENTFD_FLAG_PIO)) {
> -		ret = kvm_io_bus_register_dev(kvm, KVM_FAST_MMIO_BUS,
> -					      p->addr, 0, &p->dev);
> -		if (ret < 0)
> -			goto register_fail;
> -	}
> -
>  	kvm->buses[bus_idx]->ioeventfd_count++;
>  	list_add_tail(&p->list, &kvm->ioeventfds);
>  
> @@ -860,8 +853,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  
>  	return 0;
>  
> -register_fail:
> -	kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
>  unlock_fail:
>  	mutex_unlock(&kvm->slots_lock);
>  
> @@ -880,7 +871,7 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  	struct eventfd_ctx       *eventfd;
>  	int                       ret = -ENOENT;
>  
> -	bus_idx = ioeventfd_bus_from_flags(args->flags);
> +	bus_idx = ioeventfd_bus_from_args(args);
>  	eventfd = eventfd_ctx_fdget(args->fd);
>  	if (IS_ERR(eventfd))
>  		return PTR_ERR(eventfd);
> @@ -901,10 +892,6 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  			continue;
>  
>  		kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
> -		if (!p->length) {
> -			kvm_io_bus_unregister_dev(kvm, KVM_FAST_MMIO_BUS,
> -						  &p->dev);
> -		}
>  		kvm->buses[bus_idx]->ioeventfd_count--;
>  		ioeventfd_release(p);
>  		ret = 0;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 0d79fe8..3e3b3bc 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3152,8 +3152,8 @@ static int __kvm_io_bus_write(struct kvm_vcpu *vcpu, struct kvm_io_bus *bus,
>  }
>  
>  /* kvm_io_bus_write - called under kvm->slots_lock */
> -int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
> -		     int len, const void *val)
> +static int kvm_io_bus_write_idx(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
> +				gpa_t addr, int len, const void *val)
>  {
>  	struct kvm_io_bus *bus;
>  	struct kvm_io_range range;
> @@ -3169,6 +3169,18 @@ int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>  	return r < 0 ? r : 0;
>  }
>  
> +int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
> +		     int len, const void *val)
> +{
> +	int r = 0;
> +
> +	if (bus_idx != KVM_MMIO_BUS ||
> +	    kvm_io_bus_write_idx(vcpu, KVM_FAST_MMIO_BUS, addr, 0, NULL) < 0)
> +		r = kvm_io_bus_write_idx(vcpu, bus_idx, addr, len, val);
> +
> +	return r < 0 ? r : 0;
> +}
> +
>  /* kvm_io_bus_write_cookie - called under kvm->slots_lock */
>  int kvm_io_bus_write_cookie(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
>  			    gpa_t addr, int len, const void *val, long cookie)
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-25  9:05 ` [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses Jason Wang
  2015-08-25 10:24   ` Cornelia Huck
  2015-08-25 11:37   ` Michael S. Tsirkin
@ 2015-08-25 11:51   ` Michael S. Tsirkin
  2015-08-26  5:10     ` Jason Wang
  2 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-08-25 11:51 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck

On Tue, Aug 25, 2015 at 05:05:47PM +0800, Jason Wang wrote:
> We register wildcard mmio eventfd on two buses, one for KVM_MMIO_BUS
> and another is KVM_FAST_MMIO_BUS. This leads to issue:
> 
> - kvm_io_bus_destroy() knows nothing about the devices on two buses
>   points to a single dev. Which will lead double free [1] during exit.
> - wildcard eventfd ignores data len, so it was registered as a
>   kvm_io_range with zero length. This will fail the binary search in
>   kvm_io_bus_get_first_dev() when we try to emulate through
>   KVM_MMIO_BUS. This will cause userspace io emulation request instead
>   of a eventfd notification (virtqueue kick will be trapped by qemu
>   instead of vhost in this case).
> 
> Fixing this by don't register wildcard mmio eventfd on two
> buses. Instead, only register it in KVM_FAST_MMIO_BUS. This fixes the
> double free issue of kvm_io_bus_destroy(). For the arch/setups that
> does not utilize KVM_FAST_MMIO_BUS, before searching KVM_MMIO_BUS, try
> KVM_FAST_MMIO_BUS first to see it it has a match.
> 
> [1] Panic caused by double free:
> 
> CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
> Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
> task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
> RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
> RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
> RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
> RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
> RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
> R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
> R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
> FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
> CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
> Stack:
> ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
> ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
>  0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
> Call Trace:
> [<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
> [<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
> [<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
> [<ffffffff811f69f7>] __fput+0xe7/0x250
> [<ffffffff811f6bae>] ____fput+0xe/0x10
> [<ffffffff81093f04>] task_work_run+0xd4/0xf0
> [<ffffffff81079358>] do_exit+0x368/0xa50
> [<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
> [<ffffffff81079ad5>] do_group_exit+0x45/0xb0
> [<ffffffff81085c71>] get_signal+0x291/0x750
> [<ffffffff810144d8>] do_signal+0x28/0xab0
> [<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
> [<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
> [<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
> [<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
> [<ffffffff817cb9af>] int_signal+0x12/0x17
> Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
> RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
> RSP <ffff88020e7f3bc8>
> 
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
> Changes from V2:
> - Tweak styles and comment suggested by Cornelia.
> Changes from v1:
> - change ioeventfd_bus_from_flags() to return KVM_FAST_MMIO_BUS when
>   needed to save lots of unnecessary changes.
> ---
>  virt/kvm/eventfd.c  | 31 +++++++++----------------------
>  virt/kvm/kvm_main.c | 16 ++++++++++++++--
>  2 files changed, 23 insertions(+), 24 deletions(-)
> 
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index 9ff4193..c3ffdc3 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -762,13 +762,16 @@ ioeventfd_check_collision(struct kvm *kvm, struct _ioeventfd *p)
>  	return false;
>  }
>  
> -static enum kvm_bus ioeventfd_bus_from_flags(__u32 flags)
> +static enum kvm_bus ioeventfd_bus_from_args(struct kvm_ioeventfd *args)
>  {
> -	if (flags & KVM_IOEVENTFD_FLAG_PIO)
> +	if (args->flags & KVM_IOEVENTFD_FLAG_PIO)
>  		return KVM_PIO_BUS;
> -	if (flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
> +	if (args->flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
>  		return KVM_VIRTIO_CCW_NOTIFY_BUS;
> -	return KVM_MMIO_BUS;
> +	/* When length is ignored, MMIO is put on a separate bus, for
> +	 * faster lookups.
> +	 */
> +	return args->len ? KVM_MMIO_BUS : KVM_FAST_MMIO_BUS;
>  }
>  
>  static int
> @@ -779,7 +782,7 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  	struct eventfd_ctx       *eventfd;
>  	int                       ret;
>  
> -	bus_idx = ioeventfd_bus_from_flags(args->flags);
> +	bus_idx = ioeventfd_bus_from_args(args);
>  	/* must be natural-word sized, or 0 to ignore length */
>  	switch (args->len) {
>  	case 0:
> @@ -843,16 +846,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  	if (ret < 0)
>  		goto unlock_fail;
>  
> -	/* When length is ignored, MMIO is also put on a separate bus, for
> -	 * faster lookups.
> -	 */
> -	if (!args->len && !(args->flags & KVM_IOEVENTFD_FLAG_PIO)) {
> -		ret = kvm_io_bus_register_dev(kvm, KVM_FAST_MMIO_BUS,
> -					      p->addr, 0, &p->dev);
> -		if (ret < 0)
> -			goto register_fail;
> -	}
> -
>  	kvm->buses[bus_idx]->ioeventfd_count++;
>  	list_add_tail(&p->list, &kvm->ioeventfds);
>  
> @@ -860,8 +853,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  
>  	return 0;
>  
> -register_fail:
> -	kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
>  unlock_fail:
>  	mutex_unlock(&kvm->slots_lock);
>  
> @@ -880,7 +871,7 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  	struct eventfd_ctx       *eventfd;
>  	int                       ret = -ENOENT;
>  
> -	bus_idx = ioeventfd_bus_from_flags(args->flags);
> +	bus_idx = ioeventfd_bus_from_args(args);
>  	eventfd = eventfd_ctx_fdget(args->fd);
>  	if (IS_ERR(eventfd))
>  		return PTR_ERR(eventfd);
> @@ -901,10 +892,6 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>  			continue;
>  
>  		kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
> -		if (!p->length) {
> -			kvm_io_bus_unregister_dev(kvm, KVM_FAST_MMIO_BUS,
> -						  &p->dev);
> -		}
>  		kvm->buses[bus_idx]->ioeventfd_count--;
>  		ioeventfd_release(p);
>  		ret = 0;
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 0d79fe8..3e3b3bc 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3152,8 +3152,8 @@ static int __kvm_io_bus_write(struct kvm_vcpu *vcpu, struct kvm_io_bus *bus,
>  }
>  
>  /* kvm_io_bus_write - called under kvm->slots_lock */
> -int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
> -		     int len, const void *val)
> +static int kvm_io_bus_write_idx(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
> +				gpa_t addr, int len, const void *val)
>  {
>  	struct kvm_io_bus *bus;
>  	struct kvm_io_range range;
> @@ -3169,6 +3169,18 @@ int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>  	return r < 0 ? r : 0;
>  }
>  
> +int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
> +		     int len, const void *val)
> +{
> +	int r = 0;
> +
> +	if (bus_idx != KVM_MMIO_BUS ||
> +	    kvm_io_bus_write_idx(vcpu, KVM_FAST_MMIO_BUS, addr, 0, NULL) < 0)
> +		r = kvm_io_bus_write_idx(vcpu, bus_idx, addr, len, val);
> +
> +	return r < 0 ? r : 0;
> +}
> +

Thinking more about this, invoking the 0-length write after
the != 0 length one would be better: it would mean we only
handle the userspace MMIO like this.

All this calls for performance tests using kvm unit tests.

>  /* kvm_io_bus_write_cookie - called under kvm->slots_lock */
>  int kvm_io_bus_write_cookie(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
>  			    gpa_t addr, int len, const void *val, long cookie)
> -- 
> 2.1.4

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-25 11:51   ` Michael S. Tsirkin
@ 2015-08-26  5:10     ` Jason Wang
  2015-08-31  3:12       ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2015-08-26  5:10 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck



On 08/25/2015 07:51 PM, Michael S. Tsirkin wrote:
> On Tue, Aug 25, 2015 at 05:05:47PM +0800, Jason Wang wrote:
>> > We register wildcard mmio eventfd on two buses, one for KVM_MMIO_BUS
>> > and another is KVM_FAST_MMIO_BUS. This leads to issue:
>> > 
>> > - kvm_io_bus_destroy() knows nothing about the devices on two buses
>> >   points to a single dev. Which will lead double free [1] during exit.
>> > - wildcard eventfd ignores data len, so it was registered as a
>> >   kvm_io_range with zero length. This will fail the binary search in
>> >   kvm_io_bus_get_first_dev() when we try to emulate through
>> >   KVM_MMIO_BUS. This will cause userspace io emulation request instead
>> >   of a eventfd notification (virtqueue kick will be trapped by qemu
>> >   instead of vhost in this case).
>> > 
>> > Fixing this by don't register wildcard mmio eventfd on two
>> > buses. Instead, only register it in KVM_FAST_MMIO_BUS. This fixes the
>> > double free issue of kvm_io_bus_destroy(). For the arch/setups that
>> > does not utilize KVM_FAST_MMIO_BUS, before searching KVM_MMIO_BUS, try
>> > KVM_FAST_MMIO_BUS first to see it it has a match.
>> > 
>> > [1] Panic caused by double free:
>> > 
>> > CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
>> > Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
>> > task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
>> > RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
>> > RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
>> > RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
>> > RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
>> > RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
>> > R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
>> > R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
>> > FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
>> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> > CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
>> > Stack:
>> > ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
>> > ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
>> >  0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
>> > Call Trace:
>> > [<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
>> > [<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
>> > [<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
>> > [<ffffffff811f69f7>] __fput+0xe7/0x250
>> > [<ffffffff811f6bae>] ____fput+0xe/0x10
>> > [<ffffffff81093f04>] task_work_run+0xd4/0xf0
>> > [<ffffffff81079358>] do_exit+0x368/0xa50
>> > [<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
>> > [<ffffffff81079ad5>] do_group_exit+0x45/0xb0
>> > [<ffffffff81085c71>] get_signal+0x291/0x750
>> > [<ffffffff810144d8>] do_signal+0x28/0xab0
>> > [<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
>> > [<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
>> > [<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
>> > [<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
>> > [<ffffffff817cb9af>] int_signal+0x12/0x17
>> > Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
>> > RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
>> > RSP <ffff88020e7f3bc8>
>> > 
>> > Cc: Gleb Natapov <gleb@kernel.org>
>> > Cc: Paolo Bonzini <pbonzini@redhat.com>
>> > Cc: Michael S. Tsirkin <mst@redhat.com>
>> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>> > ---
>> > Changes from V2:
>> > - Tweak styles and comment suggested by Cornelia.
>> > Changes from v1:
>> > - change ioeventfd_bus_from_flags() to return KVM_FAST_MMIO_BUS when
>> >   needed to save lots of unnecessary changes.
>> > ---
>> >  virt/kvm/eventfd.c  | 31 +++++++++----------------------
>> >  virt/kvm/kvm_main.c | 16 ++++++++++++++--
>> >  2 files changed, 23 insertions(+), 24 deletions(-)
>> > 
>> > diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
>> > index 9ff4193..c3ffdc3 100644
>> > --- a/virt/kvm/eventfd.c
>> > +++ b/virt/kvm/eventfd.c
>> > @@ -762,13 +762,16 @@ ioeventfd_check_collision(struct kvm *kvm, struct _ioeventfd *p)
>> >  	return false;
>> >  }
>> >  
>> > -static enum kvm_bus ioeventfd_bus_from_flags(__u32 flags)
>> > +static enum kvm_bus ioeventfd_bus_from_args(struct kvm_ioeventfd *args)
>> >  {
>> > -	if (flags & KVM_IOEVENTFD_FLAG_PIO)
>> > +	if (args->flags & KVM_IOEVENTFD_FLAG_PIO)
>> >  		return KVM_PIO_BUS;
>> > -	if (flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
>> > +	if (args->flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
>> >  		return KVM_VIRTIO_CCW_NOTIFY_BUS;
>> > -	return KVM_MMIO_BUS;
>> > +	/* When length is ignored, MMIO is put on a separate bus, for
>> > +	 * faster lookups.
>> > +	 */
>> > +	return args->len ? KVM_MMIO_BUS : KVM_FAST_MMIO_BUS;
>> >  }
>> >  
>> >  static int
>> > @@ -779,7 +782,7 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>> >  	struct eventfd_ctx       *eventfd;
>> >  	int                       ret;
>> >  
>> > -	bus_idx = ioeventfd_bus_from_flags(args->flags);
>> > +	bus_idx = ioeventfd_bus_from_args(args);
>> >  	/* must be natural-word sized, or 0 to ignore length */
>> >  	switch (args->len) {
>> >  	case 0:
>> > @@ -843,16 +846,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>> >  	if (ret < 0)
>> >  		goto unlock_fail;
>> >  
>> > -	/* When length is ignored, MMIO is also put on a separate bus, for
>> > -	 * faster lookups.
>> > -	 */
>> > -	if (!args->len && !(args->flags & KVM_IOEVENTFD_FLAG_PIO)) {
>> > -		ret = kvm_io_bus_register_dev(kvm, KVM_FAST_MMIO_BUS,
>> > -					      p->addr, 0, &p->dev);
>> > -		if (ret < 0)
>> > -			goto register_fail;
>> > -	}
>> > -
>> >  	kvm->buses[bus_idx]->ioeventfd_count++;
>> >  	list_add_tail(&p->list, &kvm->ioeventfds);
>> >  
>> > @@ -860,8 +853,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>> >  
>> >  	return 0;
>> >  
>> > -register_fail:
>> > -	kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
>> >  unlock_fail:
>> >  	mutex_unlock(&kvm->slots_lock);
>> >  
>> > @@ -880,7 +871,7 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>> >  	struct eventfd_ctx       *eventfd;
>> >  	int                       ret = -ENOENT;
>> >  
>> > -	bus_idx = ioeventfd_bus_from_flags(args->flags);
>> > +	bus_idx = ioeventfd_bus_from_args(args);
>> >  	eventfd = eventfd_ctx_fdget(args->fd);
>> >  	if (IS_ERR(eventfd))
>> >  		return PTR_ERR(eventfd);
>> > @@ -901,10 +892,6 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>> >  			continue;
>> >  
>> >  		kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
>> > -		if (!p->length) {
>> > -			kvm_io_bus_unregister_dev(kvm, KVM_FAST_MMIO_BUS,
>> > -						  &p->dev);
>> > -		}
>> >  		kvm->buses[bus_idx]->ioeventfd_count--;
>> >  		ioeventfd_release(p);
>> >  		ret = 0;
>> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>> > index 0d79fe8..3e3b3bc 100644
>> > --- a/virt/kvm/kvm_main.c
>> > +++ b/virt/kvm/kvm_main.c
>> > @@ -3152,8 +3152,8 @@ static int __kvm_io_bus_write(struct kvm_vcpu *vcpu, struct kvm_io_bus *bus,
>> >  }
>> >  
>> >  /* kvm_io_bus_write - called under kvm->slots_lock */
>> > -int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>> > -		     int len, const void *val)
>> > +static int kvm_io_bus_write_idx(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
>> > +				gpa_t addr, int len, const void *val)
>> >  {
>> >  	struct kvm_io_bus *bus;
>> >  	struct kvm_io_range range;
>> > @@ -3169,6 +3169,18 @@ int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>> >  	return r < 0 ? r : 0;
>> >  }
>> >  
>> > +int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>> > +		     int len, const void *val)
>> > +{
>> > +	int r = 0;
>> > +
>> > +	if (bus_idx != KVM_MMIO_BUS ||
>> > +	    kvm_io_bus_write_idx(vcpu, KVM_FAST_MMIO_BUS, addr, 0, NULL) < 0)
>> > +		r = kvm_io_bus_write_idx(vcpu, bus_idx, addr, len, val);
>> > +
>> > +	return r < 0 ? r : 0;
>> > +}
>> > +
> Thinking more about this, invoking the 0-length write after
> the != 0 length one would be better: it would mean we only
> handle the userspace MMIO like this.

Right.

>
> All this calls for performance tests using kvm unit tests.
>

Ok, so if no obvious performance changes, do you agree this approach or
you still want to for two event fds?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-26  5:10     ` Jason Wang
@ 2015-08-31  3:12       ` Jason Wang
  2015-08-31  7:29         ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2015-08-31  3:12 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck



On 08/26/2015 01:10 PM, Jason Wang wrote:
> On 08/25/2015 07:51 PM, Michael S. Tsirkin wrote:
>> > On Tue, Aug 25, 2015 at 05:05:47PM +0800, Jason Wang wrote:
>>>> >> > We register wildcard mmio eventfd on two buses, one for KVM_MMIO_BUS
>>>> >> > and another is KVM_FAST_MMIO_BUS. This leads to issue:
>>>> >> > 
>>>> >> > - kvm_io_bus_destroy() knows nothing about the devices on two buses
>>>> >> >   points to a single dev. Which will lead double free [1] during exit.
>>>> >> > - wildcard eventfd ignores data len, so it was registered as a
>>>> >> >   kvm_io_range with zero length. This will fail the binary search in
>>>> >> >   kvm_io_bus_get_first_dev() when we try to emulate through
>>>> >> >   KVM_MMIO_BUS. This will cause userspace io emulation request instead
>>>> >> >   of a eventfd notification (virtqueue kick will be trapped by qemu
>>>> >> >   instead of vhost in this case).
>>>> >> > 
>>>> >> > Fixing this by don't register wildcard mmio eventfd on two
>>>> >> > buses. Instead, only register it in KVM_FAST_MMIO_BUS. This fixes the
>>>> >> > double free issue of kvm_io_bus_destroy(). For the arch/setups that
>>>> >> > does not utilize KVM_FAST_MMIO_BUS, before searching KVM_MMIO_BUS, try
>>>> >> > KVM_FAST_MMIO_BUS first to see it it has a match.
>>>> >> > 
>>>> >> > [1] Panic caused by double free:
>>>> >> > 
>>>> >> > CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
>>>> >> > Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
>>>> >> > task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
>>>> >> > RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
>>>> >> > RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
>>>> >> > RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
>>>> >> > RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
>>>> >> > RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
>>>> >> > R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
>>>> >> > R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
>>>> >> > FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
>>>> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> >> > CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
>>>> >> > Stack:
>>>> >> > ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
>>>> >> > ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
>>>> >> >  0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
>>>> >> > Call Trace:
>>>> >> > [<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
>>>> >> > [<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
>>>> >> > [<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
>>>> >> > [<ffffffff811f69f7>] __fput+0xe7/0x250
>>>> >> > [<ffffffff811f6bae>] ____fput+0xe/0x10
>>>> >> > [<ffffffff81093f04>] task_work_run+0xd4/0xf0
>>>> >> > [<ffffffff81079358>] do_exit+0x368/0xa50
>>>> >> > [<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
>>>> >> > [<ffffffff81079ad5>] do_group_exit+0x45/0xb0
>>>> >> > [<ffffffff81085c71>] get_signal+0x291/0x750
>>>> >> > [<ffffffff810144d8>] do_signal+0x28/0xab0
>>>> >> > [<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
>>>> >> > [<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
>>>> >> > [<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
>>>> >> > [<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
>>>> >> > [<ffffffff817cb9af>] int_signal+0x12/0x17
>>>> >> > Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
>>>> >> > RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
>>>> >> > RSP <ffff88020e7f3bc8>
>>>> >> > 
>>>> >> > Cc: Gleb Natapov <gleb@kernel.org>
>>>> >> > Cc: Paolo Bonzini <pbonzini@redhat.com>
>>>> >> > Cc: Michael S. Tsirkin <mst@redhat.com>
>>>> >> > Signed-off-by: Jason Wang <jasowang@redhat.com>
>>>> >> > ---
>>>> >> > Changes from V2:
>>>> >> > - Tweak styles and comment suggested by Cornelia.
>>>> >> > Changes from v1:
>>>> >> > - change ioeventfd_bus_from_flags() to return KVM_FAST_MMIO_BUS when
>>>> >> >   needed to save lots of unnecessary changes.
>>>> >> > ---
>>>> >> >  virt/kvm/eventfd.c  | 31 +++++++++----------------------
>>>> >> >  virt/kvm/kvm_main.c | 16 ++++++++++++++--
>>>> >> >  2 files changed, 23 insertions(+), 24 deletions(-)
>>>> >> > 
>>>> >> > diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
>>>> >> > index 9ff4193..c3ffdc3 100644
>>>> >> > --- a/virt/kvm/eventfd.c
>>>> >> > +++ b/virt/kvm/eventfd.c
>>>> >> > @@ -762,13 +762,16 @@ ioeventfd_check_collision(struct kvm *kvm, struct _ioeventfd *p)
>>>> >> >  	return false;
>>>> >> >  }
>>>> >> >  
>>>> >> > -static enum kvm_bus ioeventfd_bus_from_flags(__u32 flags)
>>>> >> > +static enum kvm_bus ioeventfd_bus_from_args(struct kvm_ioeventfd *args)
>>>> >> >  {
>>>> >> > -	if (flags & KVM_IOEVENTFD_FLAG_PIO)
>>>> >> > +	if (args->flags & KVM_IOEVENTFD_FLAG_PIO)
>>>> >> >  		return KVM_PIO_BUS;
>>>> >> > -	if (flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
>>>> >> > +	if (args->flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
>>>> >> >  		return KVM_VIRTIO_CCW_NOTIFY_BUS;
>>>> >> > -	return KVM_MMIO_BUS;
>>>> >> > +	/* When length is ignored, MMIO is put on a separate bus, for
>>>> >> > +	 * faster lookups.
>>>> >> > +	 */
>>>> >> > +	return args->len ? KVM_MMIO_BUS : KVM_FAST_MMIO_BUS;
>>>> >> >  }
>>>> >> >  
>>>> >> >  static int
>>>> >> > @@ -779,7 +782,7 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>>>> >> >  	struct eventfd_ctx       *eventfd;
>>>> >> >  	int                       ret;
>>>> >> >  
>>>> >> > -	bus_idx = ioeventfd_bus_from_flags(args->flags);
>>>> >> > +	bus_idx = ioeventfd_bus_from_args(args);
>>>> >> >  	/* must be natural-word sized, or 0 to ignore length */
>>>> >> >  	switch (args->len) {
>>>> >> >  	case 0:
>>>> >> > @@ -843,16 +846,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>>>> >> >  	if (ret < 0)
>>>> >> >  		goto unlock_fail;
>>>> >> >  
>>>> >> > -	/* When length is ignored, MMIO is also put on a separate bus, for
>>>> >> > -	 * faster lookups.
>>>> >> > -	 */
>>>> >> > -	if (!args->len && !(args->flags & KVM_IOEVENTFD_FLAG_PIO)) {
>>>> >> > -		ret = kvm_io_bus_register_dev(kvm, KVM_FAST_MMIO_BUS,
>>>> >> > -					      p->addr, 0, &p->dev);
>>>> >> > -		if (ret < 0)
>>>> >> > -			goto register_fail;
>>>> >> > -	}
>>>> >> > -
>>>> >> >  	kvm->buses[bus_idx]->ioeventfd_count++;
>>>> >> >  	list_add_tail(&p->list, &kvm->ioeventfds);
>>>> >> >  
>>>> >> > @@ -860,8 +853,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>>>> >> >  
>>>> >> >  	return 0;
>>>> >> >  
>>>> >> > -register_fail:
>>>> >> > -	kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
>>>> >> >  unlock_fail:
>>>> >> >  	mutex_unlock(&kvm->slots_lock);
>>>> >> >  
>>>> >> > @@ -880,7 +871,7 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>>>> >> >  	struct eventfd_ctx       *eventfd;
>>>> >> >  	int                       ret = -ENOENT;
>>>> >> >  
>>>> >> > -	bus_idx = ioeventfd_bus_from_flags(args->flags);
>>>> >> > +	bus_idx = ioeventfd_bus_from_args(args);
>>>> >> >  	eventfd = eventfd_ctx_fdget(args->fd);
>>>> >> >  	if (IS_ERR(eventfd))
>>>> >> >  		return PTR_ERR(eventfd);
>>>> >> > @@ -901,10 +892,6 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
>>>> >> >  			continue;
>>>> >> >  
>>>> >> >  		kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
>>>> >> > -		if (!p->length) {
>>>> >> > -			kvm_io_bus_unregister_dev(kvm, KVM_FAST_MMIO_BUS,
>>>> >> > -						  &p->dev);
>>>> >> > -		}
>>>> >> >  		kvm->buses[bus_idx]->ioeventfd_count--;
>>>> >> >  		ioeventfd_release(p);
>>>> >> >  		ret = 0;
>>>> >> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
>>>> >> > index 0d79fe8..3e3b3bc 100644
>>>> >> > --- a/virt/kvm/kvm_main.c
>>>> >> > +++ b/virt/kvm/kvm_main.c
>>>> >> > @@ -3152,8 +3152,8 @@ static int __kvm_io_bus_write(struct kvm_vcpu *vcpu, struct kvm_io_bus *bus,
>>>> >> >  }
>>>> >> >  
>>>> >> >  /* kvm_io_bus_write - called under kvm->slots_lock */
>>>> >> > -int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>>>> >> > -		     int len, const void *val)
>>>> >> > +static int kvm_io_bus_write_idx(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
>>>> >> > +				gpa_t addr, int len, const void *val)
>>>> >> >  {
>>>> >> >  	struct kvm_io_bus *bus;
>>>> >> >  	struct kvm_io_range range;
>>>> >> > @@ -3169,6 +3169,18 @@ int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>>>> >> >  	return r < 0 ? r : 0;
>>>> >> >  }
>>>> >> >  
>>>> >> > +int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
>>>> >> > +		     int len, const void *val)
>>>> >> > +{
>>>> >> > +	int r = 0;
>>>> >> > +
>>>> >> > +	if (bus_idx != KVM_MMIO_BUS ||
>>>> >> > +	    kvm_io_bus_write_idx(vcpu, KVM_FAST_MMIO_BUS, addr, 0, NULL) < 0)
>>>> >> > +		r = kvm_io_bus_write_idx(vcpu, bus_idx, addr, len, val);
>>>> >> > +
>>>> >> > +	return r < 0 ? r : 0;
>>>> >> > +}
>>>> >> > +
>> > Thinking more about this, invoking the 0-length write after
>> > the != 0 length one would be better: it would mean we only
>> > handle the userspace MMIO like this.
> Right.
>

Using current unittest. This patch is about 2.9% slower than before, and
invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).

/patch/result/-+%/
/base/2957/0/
/V3/3043/+2.9%/
/V3+invoking != 0 length first/2990/+1.1%/

So looks like the best method is not searching KVM_FAST_MMIO_BUS during
KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
wildcard in this case. Does this sound good to you?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-31  3:12       ` Jason Wang
@ 2015-08-31  7:29         ` Michael S. Tsirkin
  2015-08-31  8:03           ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-08-31  7:29 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck

On Mon, Aug 31, 2015 at 11:12:07AM +0800, Jason Wang wrote:
> 
> 
> On 08/26/2015 01:10 PM, Jason Wang wrote:
> > On 08/25/2015 07:51 PM, Michael S. Tsirkin wrote:
> >> > On Tue, Aug 25, 2015 at 05:05:47PM +0800, Jason Wang wrote:
> >>>> >> > We register wildcard mmio eventfd on two buses, one for KVM_MMIO_BUS
> >>>> >> > and another is KVM_FAST_MMIO_BUS. This leads to issue:
> >>>> >> > 
> >>>> >> > - kvm_io_bus_destroy() knows nothing about the devices on two buses
> >>>> >> >   points to a single dev. Which will lead double free [1] during exit.
> >>>> >> > - wildcard eventfd ignores data len, so it was registered as a
> >>>> >> >   kvm_io_range with zero length. This will fail the binary search in
> >>>> >> >   kvm_io_bus_get_first_dev() when we try to emulate through
> >>>> >> >   KVM_MMIO_BUS. This will cause userspace io emulation request instead
> >>>> >> >   of a eventfd notification (virtqueue kick will be trapped by qemu
> >>>> >> >   instead of vhost in this case).
> >>>> >> > 
> >>>> >> > Fixing this by don't register wildcard mmio eventfd on two
> >>>> >> > buses. Instead, only register it in KVM_FAST_MMIO_BUS. This fixes the
> >>>> >> > double free issue of kvm_io_bus_destroy(). For the arch/setups that
> >>>> >> > does not utilize KVM_FAST_MMIO_BUS, before searching KVM_MMIO_BUS, try
> >>>> >> > KVM_FAST_MMIO_BUS first to see it it has a match.
> >>>> >> > 
> >>>> >> > [1] Panic caused by double free:
> >>>> >> > 
> >>>> >> > CPU: 1 PID: 2894 Comm: qemu-system-x86 Not tainted 3.19.0-26-generic #28-Ubuntu
> >>>> >> > Hardware name: LENOVO 2356BG6/2356BG6, BIOS G7ET96WW (2.56 ) 09/12/2013
> >>>> >> > task: ffff88009ae0c4b0 ti: ffff88020e7f0000 task.ti: ffff88020e7f0000
> >>>> >> > RIP: 0010:[<ffffffffc07e25d8>]  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
> >>>> >> > RSP: 0018:ffff88020e7f3bc8  EFLAGS: 00010292
> >>>> >> > RAX: dead000000200200 RBX: ffff8801ec19c900 RCX: 000000018200016d
> >>>> >> > RDX: ffff8801ec19cf80 RSI: ffffea0008bf1d40 RDI: ffff8801ec19c900
> >>>> >> > RBP: ffff88020e7f3bd8 R08: 000000002fc75a01 R09: 000000018200016d
> >>>> >> > R10: ffffffffc07df6ae R11: ffff88022fc75a98 R12: ffff88021e7cc000
> >>>> >> > R13: ffff88021e7cca48 R14: ffff88021e7cca50 R15: ffff8801ec19c880
> >>>> >> > FS:  00007fc1ee3e6700(0000) GS:ffff88023e240000(0000) knlGS:0000000000000000
> >>>> >> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> >>>> >> > CR2: 00007f8f389d8000 CR3: 000000023dc13000 CR4: 00000000001427e0
> >>>> >> > Stack:
> >>>> >> > ffff88021e7cc000 0000000000000000 ffff88020e7f3be8 ffffffffc07e2622
> >>>> >> > ffff88020e7f3c38 ffffffffc07df69a ffff880232524160 ffff88020e792d80
> >>>> >> >  0000000000000000 ffff880219b78c00 0000000000000008 ffff8802321686a8
> >>>> >> > Call Trace:
> >>>> >> > [<ffffffffc07e2622>] ioeventfd_destructor+0x12/0x20 [kvm]
> >>>> >> > [<ffffffffc07df69a>] kvm_put_kvm+0xca/0x210 [kvm]
> >>>> >> > [<ffffffffc07df818>] kvm_vcpu_release+0x18/0x20 [kvm]
> >>>> >> > [<ffffffff811f69f7>] __fput+0xe7/0x250
> >>>> >> > [<ffffffff811f6bae>] ____fput+0xe/0x10
> >>>> >> > [<ffffffff81093f04>] task_work_run+0xd4/0xf0
> >>>> >> > [<ffffffff81079358>] do_exit+0x368/0xa50
> >>>> >> > [<ffffffff81082c8f>] ? recalc_sigpending+0x1f/0x60
> >>>> >> > [<ffffffff81079ad5>] do_group_exit+0x45/0xb0
> >>>> >> > [<ffffffff81085c71>] get_signal+0x291/0x750
> >>>> >> > [<ffffffff810144d8>] do_signal+0x28/0xab0
> >>>> >> > [<ffffffff810f3a3b>] ? do_futex+0xdb/0x5d0
> >>>> >> > [<ffffffff810b7028>] ? __wake_up_locked_key+0x18/0x20
> >>>> >> > [<ffffffff810f3fa6>] ? SyS_futex+0x76/0x170
> >>>> >> > [<ffffffff81014fc9>] do_notify_resume+0x69/0xb0
> >>>> >> > [<ffffffff817cb9af>] int_signal+0x12/0x17
> >>>> >> > Code: 5d c3 90 0f 1f 44 00 00 55 48 89 e5 53 48 89 fb 48 83 ec 08 48 8b 7f 20 e8 06 d6 a5 c0 48 8b 43 08 48 8b 13 48 89 df 48 89 42 08 <48> 89 10 48 b8 00 01 10 00 00
> >>>> >> > RIP  [<ffffffffc07e25d8>] ioeventfd_release+0x28/0x60 [kvm]
> >>>> >> > RSP <ffff88020e7f3bc8>
> >>>> >> > 
> >>>> >> > Cc: Gleb Natapov <gleb@kernel.org>
> >>>> >> > Cc: Paolo Bonzini <pbonzini@redhat.com>
> >>>> >> > Cc: Michael S. Tsirkin <mst@redhat.com>
> >>>> >> > Signed-off-by: Jason Wang <jasowang@redhat.com>
> >>>> >> > ---
> >>>> >> > Changes from V2:
> >>>> >> > - Tweak styles and comment suggested by Cornelia.
> >>>> >> > Changes from v1:
> >>>> >> > - change ioeventfd_bus_from_flags() to return KVM_FAST_MMIO_BUS when
> >>>> >> >   needed to save lots of unnecessary changes.
> >>>> >> > ---
> >>>> >> >  virt/kvm/eventfd.c  | 31 +++++++++----------------------
> >>>> >> >  virt/kvm/kvm_main.c | 16 ++++++++++++++--
> >>>> >> >  2 files changed, 23 insertions(+), 24 deletions(-)
> >>>> >> > 
> >>>> >> > diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> >>>> >> > index 9ff4193..c3ffdc3 100644
> >>>> >> > --- a/virt/kvm/eventfd.c
> >>>> >> > +++ b/virt/kvm/eventfd.c
> >>>> >> > @@ -762,13 +762,16 @@ ioeventfd_check_collision(struct kvm *kvm, struct _ioeventfd *p)
> >>>> >> >  	return false;
> >>>> >> >  }
> >>>> >> >  
> >>>> >> > -static enum kvm_bus ioeventfd_bus_from_flags(__u32 flags)
> >>>> >> > +static enum kvm_bus ioeventfd_bus_from_args(struct kvm_ioeventfd *args)
> >>>> >> >  {
> >>>> >> > -	if (flags & KVM_IOEVENTFD_FLAG_PIO)
> >>>> >> > +	if (args->flags & KVM_IOEVENTFD_FLAG_PIO)
> >>>> >> >  		return KVM_PIO_BUS;
> >>>> >> > -	if (flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
> >>>> >> > +	if (args->flags & KVM_IOEVENTFD_FLAG_VIRTIO_CCW_NOTIFY)
> >>>> >> >  		return KVM_VIRTIO_CCW_NOTIFY_BUS;
> >>>> >> > -	return KVM_MMIO_BUS;
> >>>> >> > +	/* When length is ignored, MMIO is put on a separate bus, for
> >>>> >> > +	 * faster lookups.
> >>>> >> > +	 */
> >>>> >> > +	return args->len ? KVM_MMIO_BUS : KVM_FAST_MMIO_BUS;
> >>>> >> >  }
> >>>> >> >  
> >>>> >> >  static int
> >>>> >> > @@ -779,7 +782,7 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
> >>>> >> >  	struct eventfd_ctx       *eventfd;
> >>>> >> >  	int                       ret;
> >>>> >> >  
> >>>> >> > -	bus_idx = ioeventfd_bus_from_flags(args->flags);
> >>>> >> > +	bus_idx = ioeventfd_bus_from_args(args);
> >>>> >> >  	/* must be natural-word sized, or 0 to ignore length */
> >>>> >> >  	switch (args->len) {
> >>>> >> >  	case 0:
> >>>> >> > @@ -843,16 +846,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
> >>>> >> >  	if (ret < 0)
> >>>> >> >  		goto unlock_fail;
> >>>> >> >  
> >>>> >> > -	/* When length is ignored, MMIO is also put on a separate bus, for
> >>>> >> > -	 * faster lookups.
> >>>> >> > -	 */
> >>>> >> > -	if (!args->len && !(args->flags & KVM_IOEVENTFD_FLAG_PIO)) {
> >>>> >> > -		ret = kvm_io_bus_register_dev(kvm, KVM_FAST_MMIO_BUS,
> >>>> >> > -					      p->addr, 0, &p->dev);
> >>>> >> > -		if (ret < 0)
> >>>> >> > -			goto register_fail;
> >>>> >> > -	}
> >>>> >> > -
> >>>> >> >  	kvm->buses[bus_idx]->ioeventfd_count++;
> >>>> >> >  	list_add_tail(&p->list, &kvm->ioeventfds);
> >>>> >> >  
> >>>> >> > @@ -860,8 +853,6 @@ kvm_assign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
> >>>> >> >  
> >>>> >> >  	return 0;
> >>>> >> >  
> >>>> >> > -register_fail:
> >>>> >> > -	kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
> >>>> >> >  unlock_fail:
> >>>> >> >  	mutex_unlock(&kvm->slots_lock);
> >>>> >> >  
> >>>> >> > @@ -880,7 +871,7 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
> >>>> >> >  	struct eventfd_ctx       *eventfd;
> >>>> >> >  	int                       ret = -ENOENT;
> >>>> >> >  
> >>>> >> > -	bus_idx = ioeventfd_bus_from_flags(args->flags);
> >>>> >> > +	bus_idx = ioeventfd_bus_from_args(args);
> >>>> >> >  	eventfd = eventfd_ctx_fdget(args->fd);
> >>>> >> >  	if (IS_ERR(eventfd))
> >>>> >> >  		return PTR_ERR(eventfd);
> >>>> >> > @@ -901,10 +892,6 @@ kvm_deassign_ioeventfd(struct kvm *kvm, struct kvm_ioeventfd *args)
> >>>> >> >  			continue;
> >>>> >> >  
> >>>> >> >  		kvm_io_bus_unregister_dev(kvm, bus_idx, &p->dev);
> >>>> >> > -		if (!p->length) {
> >>>> >> > -			kvm_io_bus_unregister_dev(kvm, KVM_FAST_MMIO_BUS,
> >>>> >> > -						  &p->dev);
> >>>> >> > -		}
> >>>> >> >  		kvm->buses[bus_idx]->ioeventfd_count--;
> >>>> >> >  		ioeventfd_release(p);
> >>>> >> >  		ret = 0;
> >>>> >> > diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> >>>> >> > index 0d79fe8..3e3b3bc 100644
> >>>> >> > --- a/virt/kvm/kvm_main.c
> >>>> >> > +++ b/virt/kvm/kvm_main.c
> >>>> >> > @@ -3152,8 +3152,8 @@ static int __kvm_io_bus_write(struct kvm_vcpu *vcpu, struct kvm_io_bus *bus,
> >>>> >> >  }
> >>>> >> >  
> >>>> >> >  /* kvm_io_bus_write - called under kvm->slots_lock */
> >>>> >> > -int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
> >>>> >> > -		     int len, const void *val)
> >>>> >> > +static int kvm_io_bus_write_idx(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx,
> >>>> >> > +				gpa_t addr, int len, const void *val)
> >>>> >> >  {
> >>>> >> >  	struct kvm_io_bus *bus;
> >>>> >> >  	struct kvm_io_range range;
> >>>> >> > @@ -3169,6 +3169,18 @@ int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
> >>>> >> >  	return r < 0 ? r : 0;
> >>>> >> >  }
> >>>> >> >  
> >>>> >> > +int kvm_io_bus_write(struct kvm_vcpu *vcpu, enum kvm_bus bus_idx, gpa_t addr,
> >>>> >> > +		     int len, const void *val)
> >>>> >> > +{
> >>>> >> > +	int r = 0;
> >>>> >> > +
> >>>> >> > +	if (bus_idx != KVM_MMIO_BUS ||
> >>>> >> > +	    kvm_io_bus_write_idx(vcpu, KVM_FAST_MMIO_BUS, addr, 0, NULL) < 0)
> >>>> >> > +		r = kvm_io_bus_write_idx(vcpu, bus_idx, addr, len, val);
> >>>> >> > +
> >>>> >> > +	return r < 0 ? r : 0;
> >>>> >> > +}
> >>>> >> > +
> >> > Thinking more about this, invoking the 0-length write after
> >> > the != 0 length one would be better: it would mean we only
> >> > handle the userspace MMIO like this.
> > Right.
> >
> 
> Using current unittest. This patch is about 2.9% slower than before, and
> invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
> 
> /patch/result/-+%/
> /base/2957/0/
> /V3/3043/+2.9%/
> /V3+invoking != 0 length first/2990/+1.1%/
> 
> So looks like the best method is not searching KVM_FAST_MMIO_BUS during
> KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
> wildcard in this case. Does this sound good to you?

No - we can't change userspace.

-- 
MST

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-31  7:29         ` Michael S. Tsirkin
@ 2015-08-31  8:03           ` Jason Wang
  2015-08-31 11:33             ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2015-08-31  8:03 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck



On 08/31/2015 03:29 PM, Michael S. Tsirkin wrote:
>>>>> Thinking more about this, invoking the 0-length write after
>>>>> > >> > the != 0 length one would be better: it would mean we only
>>>>> > >> > handle the userspace MMIO like this.
>>> > > Right.
>>> > >
>> > 
>> > Using current unittest. This patch is about 2.9% slower than before, and
>> > invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
>> > 
>> > /patch/result/-+%/
>> > /base/2957/0/
>> > /V3/3043/+2.9%/
>> > /V3+invoking != 0 length first/2990/+1.1%/
>> > 
>> > So looks like the best method is not searching KVM_FAST_MMIO_BUS during
>> > KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
>> > wildcard in this case. Does this sound good to you?
> No - we can't change userspace.

Actually, the change was as simple as following. So I don't get the
reason why.

diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
index 9935029..42ee986 100644
--- a/hw/virtio/virtio-pci.c
+++ b/hw/virtio/virtio-pci.c
@@ -288,6 +288,8 @@ static int
virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
         if (modern) {
             memory_region_add_eventfd(modern_mr, modern_addr, 2,
                                       true, n, notifier);
+            memory_region_add_eventfd(modern_mr, modern_addr, 0,
+                                      false, n, notifier);
         }
         if (legacy) {
             memory_region_add_eventfd(legacy_mr, legacy_addr, 2,
@@ -297,6 +299,8 @@ static int
virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
         if (modern) {
             memory_region_del_eventfd(modern_mr, modern_addr, 2,
                                       true, n, notifier);
+            memory_region_del_eventfd(modern_mr, modern_addr, 0,
+                                      false, n, notifier);
         }
         if (legacy) {
             memory_region_del_eventfd(legacy_mr, legacy_addr, 2,


^ permalink raw reply related	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-31  8:03           ` Jason Wang
@ 2015-08-31 11:33             ` Michael S. Tsirkin
  2015-09-01  3:33               ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-08-31 11:33 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck

On Mon, Aug 31, 2015 at 04:03:59PM +0800, Jason Wang wrote:
> 
> 
> On 08/31/2015 03:29 PM, Michael S. Tsirkin wrote:
> >>>>> Thinking more about this, invoking the 0-length write after
> >>>>> > >> > the != 0 length one would be better: it would mean we only
> >>>>> > >> > handle the userspace MMIO like this.
> >>> > > Right.
> >>> > >
> >> > 
> >> > Using current unittest. This patch is about 2.9% slower than before, and
> >> > invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
> >> > 
> >> > /patch/result/-+%/
> >> > /base/2957/0/
> >> > /V3/3043/+2.9%/
> >> > /V3+invoking != 0 length first/2990/+1.1%/
> >> > 
> >> > So looks like the best method is not searching KVM_FAST_MMIO_BUS during
> >> > KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
> >> > wildcard in this case. Does this sound good to you?
> > No - we can't change userspace.
> 
> Actually, the change was as simple as following. So I don't get the
> reason why.

Because it's too late - we committed to a specific userspace ABI
when this was merged in kernel, we must maintain it.
Even if I thought yours is a good API (and I don't BTW - it's exposing
internal implementation details) it's too late to change it.

> diff --git a/hw/virtio/virtio-pci.c b/hw/virtio/virtio-pci.c
> index 9935029..42ee986 100644
> --- a/hw/virtio/virtio-pci.c
> +++ b/hw/virtio/virtio-pci.c
> @@ -288,6 +288,8 @@ static int
> virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
>          if (modern) {
>              memory_region_add_eventfd(modern_mr, modern_addr, 2,
>                                        true, n, notifier);
> +            memory_region_add_eventfd(modern_mr, modern_addr, 0,
> +                                      false, n, notifier);
>          }
>          if (legacy) {
>              memory_region_add_eventfd(legacy_mr, legacy_addr, 2,
> @@ -297,6 +299,8 @@ static int
> virtio_pci_set_host_notifier_internal(VirtIOPCIProxy *proxy,
>          if (modern) {
>              memory_region_del_eventfd(modern_mr, modern_addr, 2,
>                                        true, n, notifier);
> +            memory_region_del_eventfd(modern_mr, modern_addr, 0,
> +                                      false, n, notifier);
>          }
>          if (legacy) {
>              memory_region_del_eventfd(legacy_mr, legacy_addr, 2,

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-08-31 11:33             ` Michael S. Tsirkin
@ 2015-09-01  3:33               ` Jason Wang
  2015-09-01  4:31                 ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2015-09-01  3:33 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck



On 08/31/2015 07:33 PM, Michael S. Tsirkin wrote:
> On Mon, Aug 31, 2015 at 04:03:59PM +0800, Jason Wang wrote:
>> > 
>> > 
>> > On 08/31/2015 03:29 PM, Michael S. Tsirkin wrote:
>>>>>>> > >>>>> Thinking more about this, invoking the 0-length write after
>>>>>>>>>>> > >>>>> > >> > the != 0 length one would be better: it would mean we only
>>>>>>>>>>> > >>>>> > >> > handle the userspace MMIO like this.
>>>>>>> > >>> > > Right.
>>>>>>> > >>> > >
>>>>> > >> > 
>>>>> > >> > Using current unittest. This patch is about 2.9% slower than before, and
>>>>> > >> > invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
>>>>> > >> > 
>>>>> > >> > /patch/result/-+%/
>>>>> > >> > /base/2957/0/
>>>>> > >> > /V3/3043/+2.9%/
>>>>> > >> > /V3+invoking != 0 length first/2990/+1.1%/
>>>>> > >> > 
>>>>> > >> > So looks like the best method is not searching KVM_FAST_MMIO_BUS during
>>>>> > >> > KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
>>>>> > >> > wildcard in this case. Does this sound good to you?
>>> > > No - we can't change userspace.
>> > 
>> > Actually, the change was as simple as following. So I don't get the
>> > reason why.
> Because it's too late - we committed to a specific userspace ABI
> when this was merged in kernel, we must maintain it.

Ok ( Though I don't think it has real users for this now because it was
actually broken).

> Even if I thought yours is a good API (and I don't BTW - it's exposing
> internal implementation details) it's too late to change it.

I believe we should document the special treatment in kernel of zero
length mmio eventfd in api.txt? If yes, is this an exposing? If not, how
can userspace know the advantages of this and use it? For better API,
probably we need another new flag just for fast mmio and obsolete
current one by failing the assigning for zero length mmio eventfd.


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-09-01  3:33               ` Jason Wang
@ 2015-09-01  4:31                 ` Michael S. Tsirkin
  2015-09-01  4:47                   ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-09-01  4:31 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck

On Tue, Sep 01, 2015 at 11:33:43AM +0800, Jason Wang wrote:
> 
> 
> On 08/31/2015 07:33 PM, Michael S. Tsirkin wrote:
> > On Mon, Aug 31, 2015 at 04:03:59PM +0800, Jason Wang wrote:
> >> > 
> >> > 
> >> > On 08/31/2015 03:29 PM, Michael S. Tsirkin wrote:
> >>>>>>> > >>>>> Thinking more about this, invoking the 0-length write after
> >>>>>>>>>>> > >>>>> > >> > the != 0 length one would be better: it would mean we only
> >>>>>>>>>>> > >>>>> > >> > handle the userspace MMIO like this.
> >>>>>>> > >>> > > Right.
> >>>>>>> > >>> > >
> >>>>> > >> > 
> >>>>> > >> > Using current unittest. This patch is about 2.9% slower than before, and
> >>>>> > >> > invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
> >>>>> > >> > 
> >>>>> > >> > /patch/result/-+%/
> >>>>> > >> > /base/2957/0/
> >>>>> > >> > /V3/3043/+2.9%/
> >>>>> > >> > /V3+invoking != 0 length first/2990/+1.1%/
> >>>>> > >> > 
> >>>>> > >> > So looks like the best method is not searching KVM_FAST_MMIO_BUS during
> >>>>> > >> > KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
> >>>>> > >> > wildcard in this case. Does this sound good to you?
> >>> > > No - we can't change userspace.
> >> > 
> >> > Actually, the change was as simple as following. So I don't get the
> >> > reason why.
> > Because it's too late - we committed to a specific userspace ABI
> > when this was merged in kernel, we must maintain it.
> 
> Ok ( Though I don't think it has real users for this now because it was
> actually broken).

It actually worked most of the time - you only trigger a use after free
on deregister.

> > Even if I thought yours is a good API (and I don't BTW - it's exposing
> > internal implementation details) it's too late to change it.
> 
> I believe we should document the special treatment in kernel of zero
> length mmio eventfd in api.txt? If yes, is this an exposing? If not, how
> can userspace know the advantages of this and use it? For better API,
> probably we need another new flag just for fast mmio and obsolete
> current one by failing the assigning for zero length mmio eventfd.

I sent a patch to update api.txt already as part of
kvm: add KVM_CAP_IOEVENTFD_PF capability.
I should probably split it out.

Sorry, I don't think the api change you propose makes sense - just fix the
crash in the existing one.

-- 
MST

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-09-01  4:31                 ` Michael S. Tsirkin
@ 2015-09-01  4:47                   ` Jason Wang
  2015-09-01  6:54                     ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2015-09-01  4:47 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck



On 09/01/2015 12:31 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 01, 2015 at 11:33:43AM +0800, Jason Wang wrote:
>>
>> On 08/31/2015 07:33 PM, Michael S. Tsirkin wrote:
>>> On Mon, Aug 31, 2015 at 04:03:59PM +0800, Jason Wang wrote:
>>>>>
>>>>> On 08/31/2015 03:29 PM, Michael S. Tsirkin wrote:
>>>>>>>>>>>>>>> Thinking more about this, invoking the 0-length write after
>>>>>>>>>>>>>>>>>>>>>>> the != 0 length one would be better: it would mean we only
>>>>>>>>>>>>>>>>>>>>>>> handle the userspace MMIO like this.
>>>>>>>>>>>>>>> Right.
>>>>>>>>>>>>>>>
>>>>>>>>>>> Using current unittest. This patch is about 2.9% slower than before, and
>>>>>>>>>>> invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
>>>>>>>>>>>
>>>>>>>>>>> /patch/result/-+%/
>>>>>>>>>>> /base/2957/0/
>>>>>>>>>>> /V3/3043/+2.9%/
>>>>>>>>>>> /V3+invoking != 0 length first/2990/+1.1%/
>>>>>>>>>>>
>>>>>>>>>>> So looks like the best method is not searching KVM_FAST_MMIO_BUS during
>>>>>>>>>>> KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
>>>>>>>>>>> wildcard in this case. Does this sound good to you?
>>>>>>> No - we can't change userspace.
>>>>> Actually, the change was as simple as following. So I don't get the
>>>>> reason why.
>>> Because it's too late - we committed to a specific userspace ABI
>>> when this was merged in kernel, we must maintain it.
>> Ok ( Though I don't think it has real users for this now because it was
>> actually broken).
> It actually worked most of the time - you only trigger a use after free
> on deregister.
>

It doesn't work for amd and intel machine without ept.

>>> Even if I thought yours is a good API (and I don't BTW - it's exposing
>>> internal implementation details) it's too late to change it.
>> I believe we should document the special treatment in kernel of zero
>> length mmio eventfd in api.txt? If yes, is this an exposing? If not, how
>> can userspace know the advantages of this and use it? For better API,
>> probably we need another new flag just for fast mmio and obsolete
>> current one by failing the assigning for zero length mmio eventfd.
> I sent a patch to update api.txt already as part of
> kvm: add KVM_CAP_IOEVENTFD_PF capability.
> I should probably split it out.
>
> Sorry, I don't think the api change you propose makes sense - just fix the
> crash in the existing one.
>

Ok, so I believe the fix should go:

- having two ioeventfds when we want to assign zero length mmio eventfd
- change the kvm_io_bus_sort_cmp() and can handle zero length correctly

What's your thought?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-09-01  4:47                   ` Jason Wang
@ 2015-09-01  6:54                     ` Michael S. Tsirkin
  2015-09-01  8:22                       ` Jason Wang
  0 siblings, 1 reply; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-09-01  6:54 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck

On Tue, Sep 01, 2015 at 12:47:36PM +0800, Jason Wang wrote:
> 
> 
> On 09/01/2015 12:31 PM, Michael S. Tsirkin wrote:
> > On Tue, Sep 01, 2015 at 11:33:43AM +0800, Jason Wang wrote:
> >>
> >> On 08/31/2015 07:33 PM, Michael S. Tsirkin wrote:
> >>> On Mon, Aug 31, 2015 at 04:03:59PM +0800, Jason Wang wrote:
> >>>>>
> >>>>> On 08/31/2015 03:29 PM, Michael S. Tsirkin wrote:
> >>>>>>>>>>>>>>> Thinking more about this, invoking the 0-length write after
> >>>>>>>>>>>>>>>>>>>>>>> the != 0 length one would be better: it would mean we only
> >>>>>>>>>>>>>>>>>>>>>>> handle the userspace MMIO like this.
> >>>>>>>>>>>>>>> Right.
> >>>>>>>>>>>>>>>
> >>>>>>>>>>> Using current unittest. This patch is about 2.9% slower than before, and
> >>>>>>>>>>> invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
> >>>>>>>>>>>
> >>>>>>>>>>> /patch/result/-+%/
> >>>>>>>>>>> /base/2957/0/
> >>>>>>>>>>> /V3/3043/+2.9%/
> >>>>>>>>>>> /V3+invoking != 0 length first/2990/+1.1%/
> >>>>>>>>>>>
> >>>>>>>>>>> So looks like the best method is not searching KVM_FAST_MMIO_BUS during
> >>>>>>>>>>> KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
> >>>>>>>>>>> wildcard in this case. Does this sound good to you?
> >>>>>>> No - we can't change userspace.
> >>>>> Actually, the change was as simple as following. So I don't get the
> >>>>> reason why.
> >>> Because it's too late - we committed to a specific userspace ABI
> >>> when this was merged in kernel, we must maintain it.
> >> Ok ( Though I don't think it has real users for this now because it was
> >> actually broken).
> > It actually worked most of the time - you only trigger a use after free
> > on deregister.
> >
> 
> It doesn't work for amd and intel machine without ept.

I thought it does :(

> >>> Even if I thought yours is a good API (and I don't BTW - it's exposing
> >>> internal implementation details) it's too late to change it.
> >> I believe we should document the special treatment in kernel of zero
> >> length mmio eventfd in api.txt? If yes, is this an exposing? If not, how
> >> can userspace know the advantages of this and use it? For better API,
> >> probably we need another new flag just for fast mmio and obsolete
> >> current one by failing the assigning for zero length mmio eventfd.
> > I sent a patch to update api.txt already as part of
> > kvm: add KVM_CAP_IOEVENTFD_PF capability.
> > I should probably split it out.
> >
> > Sorry, I don't think the api change you propose makes sense - just fix the
> > crash in the existing one.
> >
> 
> Ok, so I believe the fix should go:
> 
> - having two ioeventfds when we want to assign zero length mmio eventfd

You mean the in-kernel data structures?

> - change the kvm_io_bus_sort_cmp() and can handle zero length correctly

This one's for amd/non ept, right? I'd rather we implemented the
fast mmio optimization for these.

> What's your thought?

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-09-01  6:54                     ` Michael S. Tsirkin
@ 2015-09-01  8:22                       ` Jason Wang
  2015-09-01  8:32                         ` Michael S. Tsirkin
  0 siblings, 1 reply; 19+ messages in thread
From: Jason Wang @ 2015-09-01  8:22 UTC (permalink / raw)
  To: Michael S. Tsirkin; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck



On 09/01/2015 02:54 PM, Michael S. Tsirkin wrote:
> On Tue, Sep 01, 2015 at 12:47:36PM +0800, Jason Wang wrote:
>>
>> On 09/01/2015 12:31 PM, Michael S. Tsirkin wrote:
>>> On Tue, Sep 01, 2015 at 11:33:43AM +0800, Jason Wang wrote:
>>>> On 08/31/2015 07:33 PM, Michael S. Tsirkin wrote:
>>>>> On Mon, Aug 31, 2015 at 04:03:59PM +0800, Jason Wang wrote:
>>>>>>> On 08/31/2015 03:29 PM, Michael S. Tsirkin wrote:
>>>>>>>>>>>>>>>>> Thinking more about this, invoking the 0-length write after
>>>>>>>>>>>>>>>>>>>>>>>>> the != 0 length one would be better: it would mean we only
>>>>>>>>>>>>>>>>>>>>>>>>> handle the userspace MMIO like this.
>>>>>>>>>>>>>>>>> Right.
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>> Using current unittest. This patch is about 2.9% slower than before, and
>>>>>>>>>>>>> invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
>>>>>>>>>>>>>
>>>>>>>>>>>>> /patch/result/-+%/
>>>>>>>>>>>>> /base/2957/0/
>>>>>>>>>>>>> /V3/3043/+2.9%/
>>>>>>>>>>>>> /V3+invoking != 0 length first/2990/+1.1%/
>>>>>>>>>>>>>
>>>>>>>>>>>>> So looks like the best method is not searching KVM_FAST_MMIO_BUS during
>>>>>>>>>>>>> KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
>>>>>>>>>>>>> wildcard in this case. Does this sound good to you?
>>>>>>>>> No - we can't change userspace.
>>>>>>> Actually, the change was as simple as following. So I don't get the
>>>>>>> reason why.
>>>>> Because it's too late - we committed to a specific userspace ABI
>>>>> when this was merged in kernel, we must maintain it.
>>>> Ok ( Though I don't think it has real users for this now because it was
>>>> actually broken).
>>> It actually worked most of the time - you only trigger a use after free
>>> on deregister.
>>>
>> It doesn't work for amd and intel machine without ept.
> I thought it does :(
>
>>>>> Even if I thought yours is a good API (and I don't BTW - it's exposing
>>>>> internal implementation details) it's too late to change it.
>>>> I believe we should document the special treatment in kernel of zero
>>>> length mmio eventfd in api.txt? If yes, is this an exposing? If not, how
>>>> can userspace know the advantages of this and use it? For better API,
>>>> probably we need another new flag just for fast mmio and obsolete
>>>> current one by failing the assigning for zero length mmio eventfd.
>>> I sent a patch to update api.txt already as part of
>>> kvm: add KVM_CAP_IOEVENTFD_PF capability.
>>> I should probably split it out.
>>>
>>> Sorry, I don't think the api change you propose makes sense - just fix the
>>> crash in the existing one.
>>>
>> Ok, so I believe the fix should go:
>>
>> - having two ioeventfds when we want to assign zero length mmio eventfd
> You mean the in-kernel data structures?

Yes.

>
>> - change the kvm_io_bus_sort_cmp() and can handle zero length correctly
> This one's for amd/non ept, right? I'd rather we implemented the
> fast mmio optimization for these.

Agree, but we'd better fix it and backport it to stable first?


^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses
  2015-09-01  8:22                       ` Jason Wang
@ 2015-09-01  8:32                         ` Michael S. Tsirkin
  0 siblings, 0 replies; 19+ messages in thread
From: Michael S. Tsirkin @ 2015-09-01  8:32 UTC (permalink / raw)
  To: Jason Wang; +Cc: gleb, pbonzini, kvm, linux-kernel, cornelia.huck

On Tue, Sep 01, 2015 at 04:22:22PM +0800, Jason Wang wrote:
> 
> 
> On 09/01/2015 02:54 PM, Michael S. Tsirkin wrote:
> > On Tue, Sep 01, 2015 at 12:47:36PM +0800, Jason Wang wrote:
> >>
> >> On 09/01/2015 12:31 PM, Michael S. Tsirkin wrote:
> >>> On Tue, Sep 01, 2015 at 11:33:43AM +0800, Jason Wang wrote:
> >>>> On 08/31/2015 07:33 PM, Michael S. Tsirkin wrote:
> >>>>> On Mon, Aug 31, 2015 at 04:03:59PM +0800, Jason Wang wrote:
> >>>>>>> On 08/31/2015 03:29 PM, Michael S. Tsirkin wrote:
> >>>>>>>>>>>>>>>>> Thinking more about this, invoking the 0-length write after
> >>>>>>>>>>>>>>>>>>>>>>>>> the != 0 length one would be better: it would mean we only
> >>>>>>>>>>>>>>>>>>>>>>>>> handle the userspace MMIO like this.
> >>>>>>>>>>>>>>>>> Right.
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>> Using current unittest. This patch is about 2.9% slower than before, and
> >>>>>>>>>>>>> invoking 0-length write after is still 1.1% slower (mmio-datamatch-eventfd).
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> /patch/result/-+%/
> >>>>>>>>>>>>> /base/2957/0/
> >>>>>>>>>>>>> /V3/3043/+2.9%/
> >>>>>>>>>>>>> /V3+invoking != 0 length first/2990/+1.1%/
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> So looks like the best method is not searching KVM_FAST_MMIO_BUS during
> >>>>>>>>>>>>> KVM_MMIO_BUS. Instead, let userspace to register both datamatch and
> >>>>>>>>>>>>> wildcard in this case. Does this sound good to you?
> >>>>>>>>> No - we can't change userspace.
> >>>>>>> Actually, the change was as simple as following. So I don't get the
> >>>>>>> reason why.
> >>>>> Because it's too late - we committed to a specific userspace ABI
> >>>>> when this was merged in kernel, we must maintain it.
> >>>> Ok ( Though I don't think it has real users for this now because it was
> >>>> actually broken).
> >>> It actually worked most of the time - you only trigger a use after free
> >>> on deregister.
> >>>
> >> It doesn't work for amd and intel machine without ept.
> > I thought it does :(
> >
> >>>>> Even if I thought yours is a good API (and I don't BTW - it's exposing
> >>>>> internal implementation details) it's too late to change it.
> >>>> I believe we should document the special treatment in kernel of zero
> >>>> length mmio eventfd in api.txt? If yes, is this an exposing? If not, how
> >>>> can userspace know the advantages of this and use it? For better API,
> >>>> probably we need another new flag just for fast mmio and obsolete
> >>>> current one by failing the assigning for zero length mmio eventfd.
> >>> I sent a patch to update api.txt already as part of
> >>> kvm: add KVM_CAP_IOEVENTFD_PF capability.
> >>> I should probably split it out.
> >>>
> >>> Sorry, I don't think the api change you propose makes sense - just fix the
> >>> crash in the existing one.
> >>>
> >> Ok, so I believe the fix should go:
> >>
> >> - having two ioeventfds when we want to assign zero length mmio eventfd
> > You mean the in-kernel data structures?
> 
> Yes.
> 
> >
> >> - change the kvm_io_bus_sort_cmp() and can handle zero length correctly
> > This one's for amd/non ept, right? I'd rather we implemented the
> > fast mmio optimization for these.
> 
> Agree, but we'd better fix it and backport it to stable first?

I would say fix it upstream first. Worry about stable later.  And I
don't see a lot of value in adding a temporary hack - it's not too much
work to just do the optimal thing directly.

But I won't nack a temporary solution if you insist.

-- 
MST

^ permalink raw reply	[flat|nested] 19+ messages in thread

* Re: [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister
  2015-08-25  9:05 [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister Jason Wang
  2015-08-25  9:05 ` [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses Jason Wang
  2015-08-25  9:05 ` [PATCH V3 3/3] kvm: add tracepoint for fast mmio Jason Wang
@ 2015-09-07 10:33 ` Paolo Bonzini
  2 siblings, 0 replies; 19+ messages in thread
From: Paolo Bonzini @ 2015-09-07 10:33 UTC (permalink / raw)
  To: Jason Wang, gleb, kvm, linux-kernel; +Cc: cornelia.huck, Michael S. Tsirkin



On 25/08/2015 11:05, Jason Wang wrote:
> All fields of kvm_io_range were initialized or copied explicitly
> afterwards. So switch to use kmalloc().
> 
> Cc: Gleb Natapov <gleb@kernel.org>
> Cc: Paolo Bonzini <pbonzini@redhat.com>
> Cc: Michael S. Tsirkin <mst@redhat.com>
> Signed-off-by: Jason Wang <jasowang@redhat.com>
> ---
>  virt/kvm/kvm_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index 8b8a444..0d79fe8 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -3248,7 +3248,7 @@ int kvm_io_bus_register_dev(struct kvm *kvm, enum kvm_bus bus_idx, gpa_t addr,
>  	if (bus->dev_count - bus->ioeventfd_count > NR_IOBUS_DEVS - 1)
>  		return -ENOSPC;
>  
> -	new_bus = kzalloc(sizeof(*bus) + ((bus->dev_count + 1) *
> +	new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count + 1) *
>  			  sizeof(struct kvm_io_range)), GFP_KERNEL);
>  	if (!new_bus)
>  		return -ENOMEM;
> @@ -3280,7 +3280,7 @@ int kvm_io_bus_unregister_dev(struct kvm *kvm, enum kvm_bus bus_idx,
>  	if (r)
>  		return r;
>  
> -	new_bus = kzalloc(sizeof(*bus) + ((bus->dev_count - 1) *
> +	new_bus = kmalloc(sizeof(*bus) + ((bus->dev_count - 1) *
>  			  sizeof(struct kvm_io_range)), GFP_KERNEL);
>  	if (!new_bus)
>  		return -ENOMEM;
> 

Applied this patch for 4.4.

Paolo

^ permalink raw reply	[flat|nested] 19+ messages in thread

end of thread, other threads:[~2015-09-07 10:33 UTC | newest]

Thread overview: 19+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2015-08-25  9:05 [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister Jason Wang
2015-08-25  9:05 ` [PATCH V3 2/3] kvm: don't register wildcard MMIO EVENTFD on two buses Jason Wang
2015-08-25 10:24   ` Cornelia Huck
2015-08-25 11:37   ` Michael S. Tsirkin
2015-08-25 11:51   ` Michael S. Tsirkin
2015-08-26  5:10     ` Jason Wang
2015-08-31  3:12       ` Jason Wang
2015-08-31  7:29         ` Michael S. Tsirkin
2015-08-31  8:03           ` Jason Wang
2015-08-31 11:33             ` Michael S. Tsirkin
2015-09-01  3:33               ` Jason Wang
2015-09-01  4:31                 ` Michael S. Tsirkin
2015-09-01  4:47                   ` Jason Wang
2015-09-01  6:54                     ` Michael S. Tsirkin
2015-09-01  8:22                       ` Jason Wang
2015-09-01  8:32                         ` Michael S. Tsirkin
2015-08-25  9:05 ` [PATCH V3 3/3] kvm: add tracepoint for fast mmio Jason Wang
2015-08-25 11:36   ` Michael S. Tsirkin
2015-09-07 10:33 ` [PATCH V3 1/3] kvm: use kmalloc() instead of kzalloc() during iodev register/unregister Paolo Bonzini

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.