* [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
@ 2011-07-15 11:37 Sasha Levin
  2011-07-15 11:37 ` [PATCH v2 2/2] x86: Raise the hard VCPU count limit Sasha Levin
  2011-07-18  8:11 ` [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry Avi Kivity
  0 siblings, 2 replies; 12+ messages in thread
From: Sasha Levin @ 2011-07-15 11:37 UTC (permalink / raw)
  To: kvm; +Cc: Sasha Levin, Avi Kivity, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

Move the check for available entries to within the spinlock.
This allows working with a larger number of VCPUs and reduces premature
exits when many VCPUs are in use.

Cc: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 virt/kvm/coalesced_mmio.c |    9 ++++++---
 1 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
index fc84875..34188db 100644
--- a/virt/kvm/coalesced_mmio.c
+++ b/virt/kvm/coalesced_mmio.c
@@ -37,7 +37,7 @@ static int coalesced_mmio_in_range(struct kvm_coalesced_mmio_dev *dev,
 	 */
 	ring = dev->kvm->coalesced_mmio_ring;
 	avail = (ring->first - ring->last - 1) % KVM_COALESCED_MMIO_MAX;
-	if (avail < KVM_MAX_VCPUS) {
+	if (avail == 0) {
 		/* full */
 		return 0;
 	}
@@ -63,11 +63,14 @@ static int coalesced_mmio_write(struct kvm_io_device *this,
 {
 	struct kvm_coalesced_mmio_dev *dev = to_mmio(this);
 	struct kvm_coalesced_mmio_ring *ring = dev->kvm->coalesced_mmio_ring;
-	if (!coalesced_mmio_in_range(dev, addr, len))
-		return -EOPNOTSUPP;
 
 	spin_lock(&dev->lock);
 
+	if (!coalesced_mmio_in_range(dev, addr, len)) {
+		spin_unlock(&dev->lock);
+		return -EOPNOTSUPP;
+	}
+
 	/* copy data in first free entry of the ring */
 
 	ring->coalesced_mmio[ring->last].phys_addr = addr;
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v2 2/2] x86: Raise the hard VCPU count limit
  2011-07-15 11:37 [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry Sasha Levin
@ 2011-07-15 11:37 ` Sasha Levin
  2011-07-21  9:24   ` Ingo Molnar
  2011-07-18  8:11 ` [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry Avi Kivity
  1 sibling, 1 reply; 12+ messages in thread
From: Sasha Levin @ 2011-07-15 11:37 UTC (permalink / raw)
  To: kvm; +Cc: Sasha Levin, Avi Kivity, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

The patch raises the hard limit of VCPU count to 254.

This will allow developers to easily work on scalability
and will allow users to test high VCPU setups easily without
patching the kernel.

To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
now returns the recommended VCPU limit (which is still 64) - this
should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
returns the hard limit which is now 254.
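
For illustration only (not part of the patch), userspace is expected to probe
the two capabilities roughly like this, with the fallbacks described in the
documentation hunk below:

#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/kvm.h>

int main(void)
{
	int kvm = open("/dev/kvm", O_RDWR);
	int soft, hard;

	if (kvm < 0)
		return 1;

	soft = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_NR_VCPUS);
	hard = ioctl(kvm, KVM_CHECK_EXTENSION, KVM_CAP_MAX_VCPUS);

	if (soft <= 0)
		soft = 4;	/* KVM_CAP_NR_VCPUS absent: assume the old default of 4 */
	if (hard <= 0)
		hard = soft;	/* KVM_CAP_MAX_VCPUS absent: hard limit == recommended */

	printf("recommended vcpus: %d, hard limit: %d\n", soft, hard);
	return 0;
}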

Cc: Avi Kivity <avi@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Marcelo Tosatti <mtosatti@redhat.com>
Cc: Pekka Enberg <penberg@kernel.org>
Suggested-by: Pekka Enberg <penberg@cs.helsinki.fi>
Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
---
 Documentation/virtual/kvm/api.txt |   11 +++++++++--
 arch/x86/include/asm/kvm_host.h   |    3 ++-
 arch/x86/kvm/x86.c                |    3 +++
 include/linux/kvm.h               |    3 ++-
 4 files changed, 16 insertions(+), 4 deletions(-)

diff --git a/Documentation/virtual/kvm/api.txt b/Documentation/virtual/kvm/api.txt
index b0e4b9c..c37022e 100644
--- a/Documentation/virtual/kvm/api.txt
+++ b/Documentation/virtual/kvm/api.txt
@@ -175,10 +175,17 @@ Parameters: vcpu id (apic id on x86)
 Returns: vcpu fd on success, -1 on error
 
 This API adds a vcpu to a virtual machine.  The vcpu id is a small integer
-in the range [0, max_vcpus).  You can use KVM_CAP_NR_VCPUS of the
-KVM_CHECK_EXTENSION ioctl() to determine the value for max_vcpus at run-time.
+in the range [0, max_vcpus).
+
+The recommended max_vcpus value can be retrieved using the KVM_CAP_NR_VCPUS of
+the KVM_CHECK_EXTENSION ioctl() at run-time.
+The maximum possible value for max_vcpus can be retrieved using the
+KVM_CAP_MAX_VCPUS of the KVM_CHECK_EXTENSION ioctl() at run-time.
+
 If the KVM_CAP_NR_VCPUS does not exist, you should assume that max_vcpus is 4
 cpus max.
+If the KVM_CAP_MAX_VCPUS does not exist, you should assume that max_vcpus is
+the same as the value returned from KVM_CAP_NR_VCPUS.
 
 On powerpc using book3s_hv mode, the vcpus are mapped onto virtual
 threads in one or more virtual CPU cores.  (This is because the
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index dd51c83..9256fa0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -26,7 +26,8 @@
 #include <asm/mtrr.h>
 #include <asm/msr-index.h>
 
-#define KVM_MAX_VCPUS 64
+#define KVM_MAX_VCPUS 254
+#define KVM_SOFT_MAX_VCPUS 64
 #define KVM_MEMORY_SLOTS 32
 /* memory slots that does not exposed to userspace */
 #define KVM_PRIVATE_MEM_SLOTS 4
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 84a28ea..8360569 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2086,6 +2086,9 @@ int kvm_dev_ioctl_check_extension(long ext)
 		r = !kvm_x86_ops->cpu_has_accelerated_tpr();
 		break;
 	case KVM_CAP_NR_VCPUS:
+		r = KVM_SOFT_MAX_VCPUS;
+		break;
+	case KVM_CAP_MAX_VCPUS:
 		r = KVM_MAX_VCPUS;
 		break;
 	case KVM_CAP_NR_MEMSLOTS:
diff --git a/include/linux/kvm.h b/include/linux/kvm.h
index 2c366b5..55f5afb 100644
--- a/include/linux/kvm.h
+++ b/include/linux/kvm.h
@@ -463,7 +463,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_VAPIC 6
 #define KVM_CAP_EXT_CPUID 7
 #define KVM_CAP_CLOCKSOURCE 8
-#define KVM_CAP_NR_VCPUS 9       /* returns max vcpus per vm */
+#define KVM_CAP_NR_VCPUS 9       /* returns recommended max vcpus per vm */
 #define KVM_CAP_NR_MEMSLOTS 10   /* returns max memory slots per vm */
 #define KVM_CAP_PIT 11
 #define KVM_CAP_NOP_IO_DELAY 12
@@ -553,6 +553,7 @@ struct kvm_ppc_pvinfo {
 #define KVM_CAP_SPAPR_TCE 63
 #define KVM_CAP_PPC_SMT 64
 #define KVM_CAP_PPC_RMA	65
+#define KVM_CAP_MAX_VCPUS 66       /* returns max vcpus per vm */
 
 #ifdef KVM_CAP_IRQ_ROUTING
 
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-15 11:37 [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry Sasha Levin
  2011-07-15 11:37 ` [PATCH v2 2/2] x86: Raise the hard VCPU count limit Sasha Levin
@ 2011-07-18  8:11 ` Avi Kivity
  2011-07-18  9:29   ` Sasha Levin
  1 sibling, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2011-07-18  8:11 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On 07/15/2011 02:37 PM, Sasha Levin wrote:
> Move the check whether there are available entries to within the spinlock.
> This allows working with larger amount of VCPUs and reduces premature
> exits when using a large number of VCPUs.
>
> Cc: Avi Kivity<avi@redhat.com>
> Cc: Ingo Molnar<mingo@elte.hu>
> Cc: Marcelo Tosatti<mtosatti@redhat.com>
> Cc: Pekka Enberg<penberg@kernel.org>
> Signed-off-by: Sasha Levin<levinsasha928@gmail.com>
> ---
>   virt/kvm/coalesced_mmio.c |    9 ++++++---
>   1 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
> index fc84875..34188db 100644
> --- a/virt/kvm/coalesced_mmio.c
> +++ b/virt/kvm/coalesced_mmio.c
> @@ -37,7 +37,7 @@ static int coalesced_mmio_in_range(struct kvm_coalesced_mmio_dev *dev,
>   	 */
>   	ring = dev->kvm->coalesced_mmio_ring;
>   	avail = (ring->first - ring->last - 1) % KVM_COALESCED_MMIO_MAX;
> -	if (avail<  KVM_MAX_VCPUS) {
> +	if (avail == 0) {
>   		/* full */
>   		return 0;
>   	}
> @@ -63,11 +63,14 @@ static int coalesced_mmio_write(struct kvm_io_device *this,
>   {
>   	struct kvm_coalesced_mmio_dev *dev = to_mmio(this);
>   	struct kvm_coalesced_mmio_ring *ring = dev->kvm->coalesced_mmio_ring;
> -	if (!coalesced_mmio_in_range(dev, addr, len))
> -		return -EOPNOTSUPP;
>
>   	spin_lock(&dev->lock);
>
> +	if (!coalesced_mmio_in_range(dev, addr, len)) {
> +		spin_unlock(&dev->lock);
> +		return -EOPNOTSUPP;
> +	}
> +
>   	/* copy data in first free entry of the ring */

Hmm.  This means we take the lock for every I/O, whether it hits 
coalesced mmio or not.

We need to do the range check before taking the lock and the space check 
after taking the lock.
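
A minimal sketch of that structure, for concreteness; coalesced_mmio_has_room()
is a hypothetical helper that holds only the space check, with the zone check
left in coalesced_mmio_in_range():

static int coalesced_mmio_write(struct kvm_io_device *this,
				gpa_t addr, int len, const void *val)
{
	struct kvm_coalesced_mmio_dev *dev = to_mmio(this);
	struct kvm_coalesced_mmio_ring *ring = dev->kvm->coalesced_mmio_ring;

	/* Cheap zone check first, with no lock held. */
	if (!coalesced_mmio_in_range(dev, addr, len))
		return -EOPNOTSUPP;

	spin_lock(&dev->lock);

	/*
	 * Space check under the lock, so the entry we claim stays ours;
	 * coalesced_mmio_has_room() is the hypothetical space-only helper.
	 */
	if (!coalesced_mmio_has_room(dev)) {
		spin_unlock(&dev->lock);
		return -EOPNOTSUPP;
	}

	/* copy data in first free entry of the ring */
	ring->coalesced_mmio[ring->last].phys_addr = addr;
	ring->coalesced_mmio[ring->last].len = len;
	memcpy(ring->coalesced_mmio[ring->last].data, val, len);
	smp_wmb();
	ring->last = (ring->last + 1) % KVM_COALESCED_MMIO_MAX;
	spin_unlock(&dev->lock);
	return 0;
}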

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-18  8:11 ` [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry Avi Kivity
@ 2011-07-18  9:29   ` Sasha Levin
  2011-07-18  9:50     ` Avi Kivity
  0 siblings, 1 reply; 12+ messages in thread
From: Sasha Levin @ 2011-07-18  9:29 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On Mon, 2011-07-18 at 11:11 +0300, Avi Kivity wrote:
> On 07/15/2011 02:37 PM, Sasha Levin wrote:
> > Move the check whether there are available entries to within the spinlock.
> > This allows working with larger amount of VCPUs and reduces premature
> > exits when using a large number of VCPUs.
> >
> > Cc: Avi Kivity<avi@redhat.com>
> > Cc: Ingo Molnar<mingo@elte.hu>
> > Cc: Marcelo Tosatti<mtosatti@redhat.com>
> > Cc: Pekka Enberg<penberg@kernel.org>
> > Signed-off-by: Sasha Levin<levinsasha928@gmail.com>
> > ---
> >   virt/kvm/coalesced_mmio.c |    9 ++++++---
> >   1 files changed, 6 insertions(+), 3 deletions(-)
> >
> > diff --git a/virt/kvm/coalesced_mmio.c b/virt/kvm/coalesced_mmio.c
> > index fc84875..34188db 100644
> > --- a/virt/kvm/coalesced_mmio.c
> > +++ b/virt/kvm/coalesced_mmio.c
> > @@ -37,7 +37,7 @@ static int coalesced_mmio_in_range(struct kvm_coalesced_mmio_dev *dev,
> >   	 */
> >   	ring = dev->kvm->coalesced_mmio_ring;
> >   	avail = (ring->first - ring->last - 1) % KVM_COALESCED_MMIO_MAX;
> > -	if (avail<  KVM_MAX_VCPUS) {
> > +	if (avail == 0) {
> >   		/* full */
> >   		return 0;
> >   	}
> > @@ -63,11 +63,14 @@ static int coalesced_mmio_write(struct kvm_io_device *this,
> >   {
> >   	struct kvm_coalesced_mmio_dev *dev = to_mmio(this);
> >   	struct kvm_coalesced_mmio_ring *ring = dev->kvm->coalesced_mmio_ring;
> > -	if (!coalesced_mmio_in_range(dev, addr, len))
> > -		return -EOPNOTSUPP;
> >
> >   	spin_lock(&dev->lock);
> >
> > +	if (!coalesced_mmio_in_range(dev, addr, len)) {
> > +		spin_unlock(&dev->lock);
> > +		return -EOPNOTSUPP;
> > +	}
> > +
> >   	/* copy data in first free entry of the ring */
> 
> Hmm.  This means we take the lock for every I/O, whether it hits 
> coalesced mmio or not.
> 
> We need to do the range check before taking the lock and the space check 
> after taking the lock.
> 

I'll fix that.

Shouldn't the range check be also locked somehow? Currently it is
possible that a coalesced region was removed while we are checking the
ranges, and we won't issue a mmio exit as the host expects.

-- 

Sasha.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-18  9:29   ` Sasha Levin
@ 2011-07-18  9:50     ` Avi Kivity
  2011-07-18 10:15       ` Sasha Levin
  0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2011-07-18  9:50 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On 07/18/2011 12:29 PM, Sasha Levin wrote:
> >  Hmm.  This means we take the lock for every I/O, whether it hits
> >  coalesced mmio or not.
> >
> >  We need to do the range check before taking the lock and the space check
> >  after taking the lock.
> >
>
> I'll fix that.
>
> Shouldn't the range check be also locked somehow? Currently it is
> possible that a coalesced region was removed while we are checking the
> ranges, and we won't issue a mmio exit as the host expects

It's "locked" using rcu.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-18  9:50     ` Avi Kivity
@ 2011-07-18 10:15       ` Sasha Levin
  2011-07-18 11:43         ` Avi Kivity
  0 siblings, 1 reply; 12+ messages in thread
From: Sasha Levin @ 2011-07-18 10:15 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On Mon, 2011-07-18 at 12:50 +0300, Avi Kivity wrote:
> On 07/18/2011 12:29 PM, Sasha Levin wrote:
> > >  Hmm.  This means we take the lock for every I/O, whether it hits
> > >  coalesced mmio or not.
> > >
> > >  We need to do the range check before taking the lock and the space check
> > >  after taking the lock.
> > >
> >
> > I'll fix that.
> >
> > Shouldn't the range check be also locked somehow? Currently it is
> > possible that a coalesced region was removed while we are checking the
> > ranges, and we won't issue a mmio exit as the host expects
> 
> It's "locked" using rcu.
> 

Where is that happening?

All the coalesced zones are stored under the coalesced "device" in a
simple array. When adding and removing zones, kvm->slots_lock is taken -
I don't see anything which prevents a range check during zone removal
unless slots_lock prevents IO.

-- 

Sasha.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-18 10:15       ` Sasha Levin
@ 2011-07-18 11:43         ` Avi Kivity
  2011-07-18 12:03           ` Sasha Levin
  0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2011-07-18 11:43 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On 07/18/2011 01:15 PM, Sasha Levin wrote:
> On Mon, 2011-07-18 at 12:50 +0300, Avi Kivity wrote:
> >  On 07/18/2011 12:29 PM, Sasha Levin wrote:
> >  >  >   Hmm.  This means we take the lock for every I/O, whether it hits
> >  >  >   coalesced mmio or not.
> >  >  >
> >  >  >   We need to do the range check before taking the lock and the space check
> >  >  >   after taking the lock.
> >  >  >
> >  >
> >  >  I'll fix that.
> >  >
> >  >  Shouldn't the range check be also locked somehow? Currently it is
> >  >  possible that a coalesced region was removed while we are checking the
> >  >  ranges, and we won't issue a mmio exit as the host expects
> >
> >  It's "locked" using rcu.
> >
>
> Where is that happening?
>
> All the coalesced zones are stored under the coalesced "device" in a
> simple array. When adding and removing zones, kvm->slots_lock is taken -
> I don't see anything which prevents a range check during zone removal
> unless slots_lock prevents IO.

Range check during slot removal is legal.  While you are removing a 
slot, a concurrent write may hit or miss the slot; it doesn't matter.

Userspace should flush the coalesced mmio buffer after removal to ensure 
there are no pending writes.
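
A rough userspace-side sketch of that flush; the ring is the page mmap()ed at
KVM_COALESCED_MMIO_PAGE_OFFSET, and emulate_mmio_write() is a placeholder for
the VMM's normal MMIO handler:

static void coalesced_mmio_flush(struct kvm_coalesced_mmio_ring *ring)
{
	/* Drain every entry the kernel queued since the last flush. */
	while (ring->first != ring->last) {
		struct kvm_coalesced_mmio *e =
			&ring->coalesced_mmio[ring->first];

		/* Placeholder for the VMM's usual MMIO emulation path. */
		emulate_mmio_write(e->phys_addr, e->data, e->len);

		/* Finish reading the entry before handing the slot back. */
		__sync_synchronize();
		ring->first = (ring->first + 1) % KVM_COALESCED_MMIO_MAX;
	}
}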

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-18 11:43         ` Avi Kivity
@ 2011-07-18 12:03           ` Sasha Levin
  2011-07-18 12:29             ` Avi Kivity
  0 siblings, 1 reply; 12+ messages in thread
From: Sasha Levin @ 2011-07-18 12:03 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On Mon, 2011-07-18 at 14:43 +0300, Avi Kivity wrote:
> On 07/18/2011 01:15 PM, Sasha Levin wrote:
> > On Mon, 2011-07-18 at 12:50 +0300, Avi Kivity wrote:
> > >  On 07/18/2011 12:29 PM, Sasha Levin wrote:
> > >  >  >   Hmm.  This means we take the lock for every I/O, whether it hits
> > >  >  >   coalesced mmio or not.
> > >  >  >
> > >  >  >   We need to do the range check before taking the lock and the space check
> > >  >  >   after taking the lock.
> > >  >  >
> > >  >
> > >  >  I'll fix that.
> > >  >
> > >  >  Shouldn't the range check be also locked somehow? Currently it is
> > >  >  possible that a coalesced region was removed while we are checking the
> > >  >  ranges, and we won't issue a mmio exit as the host expects
> > >
> > >  It's "locked" using rcu.
> > >
> >
> > Where is that happening?
> >
> > All the coalesced zones are stored under the coalesced "device" in a
> > simple array. When adding and removing zones, kvm->slots_lock is taken -
> > I don't see anything which prevents a range check during zone removal
> > unless slots_lock prevents IO.
> 
> Range check during slot removal is legal.  While you are removing a 
> slot, a concurrent write may hit or miss the slot; it doesn't matter.
> 
> Userspace should flush the coalesced mmio buffer after removal to ensure 
> there are no pending writes.
> 

But the write may hit a non-existent slot.

Something like this:

Thread 1		Thread 2
----------------------------------
Check range	|
Found slot	|
		| Remove slot
		| Flush buffer
Get spinlock	|
Write to buffer	|


-- 

Sasha.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-18 12:03           ` Sasha Levin
@ 2011-07-18 12:29             ` Avi Kivity
  2011-07-18 12:58               ` Sasha Levin
  0 siblings, 1 reply; 12+ messages in thread
From: Avi Kivity @ 2011-07-18 12:29 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On 07/18/2011 03:03 PM, Sasha Levin wrote:
> On Mon, 2011-07-18 at 14:43 +0300, Avi Kivity wrote:
> >  On 07/18/2011 01:15 PM, Sasha Levin wrote:
> >  >  On Mon, 2011-07-18 at 12:50 +0300, Avi Kivity wrote:
> >  >  >   On 07/18/2011 12:29 PM, Sasha Levin wrote:
> >  >  >   >   >    Hmm.  This means we take the lock for every I/O, whether it hits
> >  >  >   >   >    coalesced mmio or not.
> >  >  >   >   >
> >  >  >   >   >    We need to do the range check before taking the lock and the space check
> >  >  >   >   >    after taking the lock.
> >  >  >   >   >
> >  >  >   >
> >  >  >   >   I'll fix that.
> >  >  >   >
> >  >  >   >   Shouldn't the range check be also locked somehow? Currently it is
> >  >  >   >   possible that a coalesced region was removed while we are checking the
> >  >  >   >   ranges, and we won't issue a mmio exit as the host expects
> >  >  >
> >  >  >   It's "locked" using rcu.
> >  >  >
> >  >
> >  >  Where is that happening?
> >  >
> >  >  All the coalesced zones are stored under the coalesced "device" in a
> >  >  simple array. When adding and removing zones, kvm->slots_lock is taken -
> >  >  I don't see anything which prevents a range check during zone removal
> >  >  unless slots_lock prevents IO.
> >
> >  Range check during slot removal is legal.  While you are removing a
> >  slot, a concurrent write may hit or miss the slot; it doesn't matter.
> >
> >  Userspace should flush the coalesced mmio buffer after removal to ensure
> >  there are no pending writes.
> >
>
> But the write may hit a non-existent slot.
>
> Something like this:
>
> Thread 1		Thread 2
> ----------------------------------
> Check range	|
> Found slot	|
> 		| Remove slot
> 		| Flush buffer
> Get spinlock	|
> Write to buffer	|
>

Cannot happen, due to rcu.  The "remove slot" step waits until all rcu 
readers are gone.

In other words: it's magic.
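
The pattern being referred to, in generic terms (illustrative types and
helpers, not the actual kvm_io_bus code):

#include <linux/rcupdate.h>
#include <linux/slab.h>
#include <linux/types.h>

struct zone_table {
	int nr;
	struct { u64 addr, size; } zone[16];
};

static struct zone_table __rcu *zones;

/* Reader (the I/O path): inspect the table without taking any lock. */
static bool addr_in_some_zone(u64 addr)
{
	struct zone_table *t;
	bool hit = false;
	int i;

	rcu_read_lock();
	t = rcu_dereference(zones);
	for (i = 0; t && i < t->nr; i++)
		if (addr >= t->zone[i].addr &&
		    addr < t->zone[i].addr + t->zone[i].size)
			hit = true;
	rcu_read_unlock();
	return hit;
}

/* Updater ("remove slot"): publish the new table, then wait out old readers. */
static void replace_zone_table(struct zone_table *new_table)
{
	struct zone_table *old = rcu_dereference_protected(zones, 1);

	rcu_assign_pointer(zones, new_table);
	synchronize_rcu();	/* every reader still using 'old' has now finished */
	kfree(old);
}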

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-18 12:29             ` Avi Kivity
@ 2011-07-18 12:58               ` Sasha Levin
  2011-07-18 13:11                 ` Avi Kivity
  0 siblings, 1 reply; 12+ messages in thread
From: Sasha Levin @ 2011-07-18 12:58 UTC (permalink / raw)
  To: Avi Kivity; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On Mon, 2011-07-18 at 15:29 +0300, Avi Kivity wrote:
> On 07/18/2011 03:03 PM, Sasha Levin wrote:
> > On Mon, 2011-07-18 at 14:43 +0300, Avi Kivity wrote:
> > >  On 07/18/2011 01:15 PM, Sasha Levin wrote:
> > >  >  On Mon, 2011-07-18 at 12:50 +0300, Avi Kivity wrote:
> > >  >  >   On 07/18/2011 12:29 PM, Sasha Levin wrote:
> > >  >  >   >   >    Hmm.  This means we take the lock for every I/O, whether it hits
> > >  >  >   >   >    coalesced mmio or not.
> > >  >  >   >   >
> > >  >  >   >   >    We need to do the range check before taking the lock and the space check
> > >  >  >   >   >    after taking the lock.
> > >  >  >   >   >
> > >  >  >   >
> > >  >  >   >   I'll fix that.
> > >  >  >   >
> > >  >  >   >   Shouldn't the range check be also locked somehow? Currently it is
> > >  >  >   >   possible that a coalesced region was removed while we are checking the
> > >  >  >   >   ranges, and we won't issue a mmio exit as the host expects
> > >  >  >
> > >  >  >   It's "locked" using rcu.
> > >  >  >
> > >  >
> > >  >  Where is that happening?
> > >  >
> > >  >  All the coalesced zones are stored under the coalesced "device" in a
> > >  >  simple array. When adding and removing zones, kvm->slots_lock is taken -
> > >  >  I don't see anything which prevents a range check during zone removal
> > >  >  unless slots_lock prevents IO.
> > >
> > >  Range check during slot removal is legal.  While you are removing a
> > >  slot, a concurrent write may hit or miss the slot; it doesn't matter.
> > >
> > >  Userspace should flush the coalesced mmio buffer after removal to ensure
> > >  there are no pending writes.
> > >
> >
> > But the write may hit a non-existent slot.
> >
> > Something like this:
> >
> > Thread 1		Thread 2
> > ----------------------------------
> > Check range	|
> > Found slot	|
> > 		| Remove slot
> > 		| Flush buffer
> > Get spinlock	|
> > Write to buffer	|
> >
> 
> Cannot happen, due to rcu.  The "remove slot" step waits until all rcu 
> readers are gone.
> 
> In other words: it's magic.
> 

I might be missing something, but I don't see anything rcu related in
anything within /virt/kvm/coalesced_mmio.c or in
kvm_vm_ioctl_unregister_coalesced_mmio() specifically.

Where is rcu invoked on the zones array?

All I see is a simple array and counter declared as such:

	int nb_zones;
	struct kvm_coalesced_mmio_zone zone[KVM_COALESCED_MMIO_ZONE_MAX];

And in the register/unregister functions it's a simple array manipulation.

-- 

Sasha.


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 1/2] KVM: MMIO: Lock coalesced device when checking for available entry
  2011-07-18 12:58               ` Sasha Levin
@ 2011-07-18 13:11                 ` Avi Kivity
  0 siblings, 0 replies; 12+ messages in thread
From: Avi Kivity @ 2011-07-18 13:11 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, Ingo Molnar, Marcelo Tosatti, Pekka Enberg

On 07/18/2011 03:58 PM, Sasha Levin wrote:
> On Mon, 2011-07-18 at 15:29 +0300, Avi Kivity wrote:
> >  On 07/18/2011 03:03 PM, Sasha Levin wrote:
> >  >  On Mon, 2011-07-18 at 14:43 +0300, Avi Kivity wrote:
> >  >  >   On 07/18/2011 01:15 PM, Sasha Levin wrote:
> >  >  >   >   On Mon, 2011-07-18 at 12:50 +0300, Avi Kivity wrote:
> >  >  >   >   >    On 07/18/2011 12:29 PM, Sasha Levin wrote:
> >  >  >   >   >    >    >     Hmm.  This means we take the lock for every I/O, whether it hits
> >  >  >   >   >    >    >     coalesced mmio or not.
> >  >  >   >   >    >    >
> >  >  >   >   >    >    >     We need to do the range check before taking the lock and the space check
> >  >  >   >   >    >    >     after taking the lock.
> >  >  >   >   >    >    >
> >  >  >   >   >    >
> >  >  >   >   >    >    I'll fix that.
> >  >  >   >   >    >
> >  >  >   >   >    >    Shouldn't the range check be also locked somehow? Currently it is
> >  >  >   >   >    >    possible that a coalesced region was removed while we are checking the
> >  >  >   >   >    >    ranges, and we won't issue a mmio exit as the host expects
> >  >  >   >   >
> >  >  >   >   >    It's "locked" using rcu.
> >  >  >   >   >
> >  >  >   >
> >  >  >   >   Where is that happening?
> >  >  >   >
> >  >  >   >   All the coalesced zones are stored under the coalesced "device" in a
> >  >  >   >   simple array. When adding and removing zones, kvm->slots_lock is taken -
> >  >  >   >   I don't see anything which prevents a range check during zone removal
> >  >  >   >   unless slots_lock prevents IO.
> >  >  >
> >  >  >   Range check during slot removal is legal.  While you are removing a
> >  >  >   slot, a concurrent write may hit or miss the slot; it doesn't matter.
> >  >  >
> >  >  >   Userspace should flush the coalesced mmio buffer after removal to ensure
> >  >  >   there are no pending writes.
> >  >  >
> >  >
> >  >  But the write may hit a non-existent slot.
> >  >
> >  >  Something like this:
> >  >
> >  >  Thread 1		Thread 2
> >  >  ----------------------------------
> >  >  Check range	|
> >  >  Found slot	|
> >  >  		| Remove slot
> >  >  		| Flush buffer
> >  >  Get spinlock	|
> >  >  Write to buffer	|
> >  >
> >
> >  Cannot happen, due to rcu.  The "remove slot" step waits until all rcu
> >  readers are gone.
> >
> >  In other words: it's magic.
> >
>
> I might be missing something, but I don't see anything rcu related in
> anything within /virt/kvm/coalesced_mmio.c or in
> kvm_vm_ioctl_unregister_coalesced_mmio() specifically.
>
> Where is rcu invoked on the zones array?
>
> All I see is a simple array and counter declared as such:
>
> 	int nb_zones;
> 	struct kvm_coalesced_mmio_zone zone[KVM_COALESCED_MMIO_ZONE_MAX];
>
> And in the register/unregister functions it's a simple array manipulation.

Er, kvm_io_bus_register() does the rcu stuff.  But 
kvm_register_coalesced_mmio() doesn't call it! Instead it's just one 
device on the bus that decodes all those offsets.

In other words: it's broken.

Luckily, it's not exploitable, since the memory is static wrt the 
lifetime of the guest.

We should probably make it separate kvm_io_devices so we can reuse the 
existing locking.
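
Roughly the shape that change would take (a sketch only, not a patch): each
zone becomes its own bus device instead of an index into the shared array,
e.g.:

/*
 * Hypothetical layout: one kvm_io_device per coalesced zone, so that the
 * existing io-bus locking covers zone addition and removal as well.
 */
struct kvm_coalesced_mmio_dev {
	struct kvm_io_device dev;		/* registered on KVM_MMIO_BUS */
	struct kvm *kvm;
	struct kvm_coalesced_mmio_zone zone;	/* this device's single zone */
};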

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/2] x86: Raise the hard VCPU count limit
  2011-07-15 11:37 ` [PATCH v2 2/2] x86: Raise the hard VCPU count limit Sasha Levin
@ 2011-07-21  9:24   ` Ingo Molnar
  0 siblings, 0 replies; 12+ messages in thread
From: Ingo Molnar @ 2011-07-21  9:24 UTC (permalink / raw)
  To: Sasha Levin; +Cc: kvm, Avi Kivity, Marcelo Tosatti, Pekka Enberg


* Sasha Levin <levinsasha928@gmail.com> wrote:

> The patch raises the hard limit of VCPU count to 254.
> 
> This will allow developers to easily work on scalability
> and will allow users to test high VCPU setups easily without
> patching the kernel.
> 
> To prevent possible issues with current setups, KVM_CAP_NR_VCPUS
> now returns the recommended VCPU limit (which is still 64) - this
> should be a safe value for everybody, while a new KVM_CAP_MAX_VCPUS
> returns the hard limit which is now 254.
> 
> Cc: Avi Kivity <avi@redhat.com>
> Cc: Ingo Molnar <mingo@elte.hu>
> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> Cc: Pekka Enberg <penberg@kernel.org>
> Suggested-by: Pekka Enberg <penberg@cs.helsinki.fi>
> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>

It would be nice to document the various internal limits that are 
present (255, etc.) and which prevent further raising of 
KVM_MAX_VCPUS.

This information got mapped out in this discussion, but the commit 
loses it so it would be nice to preserve it somewhere (at least in 
the changelog).

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 12+ messages in thread

