All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC][PATCH 0/3]: Fixes to IRQ routing
@ 2010-06-08 17:55 Chris Lalancette
  2010-06-08 17:55 ` [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts Chris Lalancette
                   ` (3 more replies)
  0 siblings, 4 replies; 20+ messages in thread
From: Chris Lalancette @ 2010-06-08 17:55 UTC (permalink / raw)
  To: kvm; +Cc: avi, mtosatti

As we've discussed previously, here is a series of patches to
fix some of the IRQ routing issues we have in KVM.  With this series
in place I was able to successfully kdump a RHEL-5 64-bit, and RHEL-6
32- and 64-bit guest on CPU's other than the BSP.  RHEL-5 32-bit kdump still
does not work; it gets stuck on "Checking 'hlt' instruction".  However,
it does that both before and after this series, so there is something
else going on there that I still have to debug.

I also need to change the "kvm_migrate_pit_timer" function to migrate the
timer over to the last CPU that handled the timer interrupt, on the
theory that that particlar CPU is likely to handle the timer interrupt again
in the near future.

The other outstanding question about these patches is how to setup the
workqueue.  As it stands I have a single workqueue for all guests.  However,
for efficiency reasons we may want to have one workqueue per guest.  Opinions?

Chris Lalancette


^ permalink raw reply	[flat|nested] 20+ messages in thread

* [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-08 17:55 [RFC][PATCH 0/3]: Fixes to IRQ routing Chris Lalancette
@ 2010-06-08 17:55 ` Chris Lalancette
  2010-06-09  2:36   ` Zachary Amsden
                     ` (2 more replies)
  2010-06-08 17:55 ` [RFC][PATCH 2/3] Allow any LAPIC to accept PIC interrupts Chris Lalancette
                   ` (2 subsequent siblings)
  3 siblings, 3 replies; 20+ messages in thread
From: Chris Lalancette @ 2010-06-08 17:55 UTC (permalink / raw)
  To: kvm; +Cc: avi, mtosatti, Chris Lalancette

We really want to "kvm_set_irq" during the hrtimer callback,
but that is risky because that is during interrupt context.
Instead, offload the work to a workqueue, which is a bit safer
and should provide most of the same functionality.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
---
 arch/x86/kvm/i8254.c |  106 ++++++++++++++++++++++++++++++-------------------
 arch/x86/kvm/i8254.h |    4 ++
 arch/x86/kvm/irq.c   |    1 -
 arch/x86/kvm/x86.c   |    4 ++
 4 files changed, 73 insertions(+), 42 deletions(-)

diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
index 188d827..dd655ef 100644
--- a/arch/x86/kvm/i8254.c
+++ b/arch/x86/kvm/i8254.c
@@ -34,6 +34,7 @@
 
 #include <linux/kvm_host.h>
 #include <linux/slab.h>
+#include <linux/workqueue.h>
 
 #include "irq.h"
 #include "i8254.h"
@@ -49,6 +50,8 @@
 #define RW_STATE_WORD0 3
 #define RW_STATE_WORD1 4
 
+static struct workqueue_struct *pit_wq;
+
 /* Compute with 96 bit intermediate result: (a*b)/c */
 static u64 muldiv64(u64 a, u32 b, u32 c)
 {
@@ -281,6 +284,58 @@ static struct kvm_timer_ops kpit_ops = {
 	.is_periodic = kpit_is_periodic,
 };
 
+static void pit_do_work(struct work_struct *work)
+{
+       struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
+       struct kvm *kvm = pit->kvm;
+       struct kvm_vcpu *vcpu;
+       int i;
+       struct kvm_kpit_state *ps = &pit->pit_state;
+       int inject = 0;
+
+       /* Try to inject pending interrupts when
+        * last one has been acked.
+        */
+       raw_spin_lock(&ps->inject_lock);
+       if (ps->irq_ack) {
+               ps->irq_ack = 0;
+               inject = 1;
+       }
+       raw_spin_unlock(&ps->inject_lock);
+       if (inject) {
+               kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
+               kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
+
+               /*
+                * Provides NMI watchdog support via Virtual Wire mode.
+                * The route is: PIT -> PIC -> LVT0 in NMI mode.
+                *
+                * Note: Our Virtual Wire implementation is simplified, only
+                * propagating PIT interrupts to all VCPUs when they have set
+                * LVT0 to NMI delivery. Other PIC interrupts are just sent to
+                * VCPU0, and only if its LVT0 is in EXTINT mode.
+                */
+               if (kvm->arch.vapics_in_nmi_mode > 0)
+                       kvm_for_each_vcpu(i, vcpu, kvm)
+                               kvm_apic_nmi_wd_deliver(vcpu);
+       }
+}
+
+static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
+{
+       struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
+       struct kvm_pit *pt = ktimer->kvm->arch.vpit;
+
+       queue_work(pit_wq, &pt->expired);
+
+       if (ktimer->t_ops->is_periodic(ktimer)) {
+               hrtimer_add_expires_ns(&ktimer->timer, ktimer->period);
+              return HRTIMER_RESTART;
+       }
+       else
+              return HRTIMER_NORESTART;
+}
+
 static void create_pit_timer(struct kvm_kpit_state *ps, u32 val, int is_period)
 {
 	struct kvm_timer *pt = &ps->pit_timer;
@@ -295,10 +350,9 @@ static void create_pit_timer(struct kvm_kpit_state *ps, u32 val, int is_period)
 	pt->period = interval;
 	ps->is_periodic = is_period;
 
-	pt->timer.function = kvm_timer_fn;
+	pt->timer.function = pit_timer_fn;
 	pt->t_ops = &kpit_ops;
 	pt->kvm = ps->pit->kvm;
-	pt->vcpu = pt->kvm->bsp_vcpu;
 
 	atomic_set(&pt->pending, 0);
 	ps->irq_ack = 1;
@@ -628,6 +682,8 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags)
 	mutex_lock(&pit->pit_state.lock);
 	raw_spin_lock_init(&pit->pit_state.inject_lock);
 
+	INIT_WORK(&pit->expired, pit_do_work);
+
 	kvm->arch.vpit = pit;
 	pit->kvm = kvm;
 
@@ -691,48 +747,16 @@ void kvm_free_pit(struct kvm *kvm)
 	}
 }
 
-static void __inject_pit_timer_intr(struct kvm *kvm)
+int kvm_pit_create(void)
 {
-	struct kvm_vcpu *vcpu;
-	int i;
+       pit_wq = create_singlethread_workqueue("kvm-pit-wq");
+       if (!pit_wq)
+               return -ENOMEM;
 
-	kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
-	kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
-
-	/*
-	 * Provides NMI watchdog support via Virtual Wire mode.
-	 * The route is: PIT -> PIC -> LVT0 in NMI mode.
-	 *
-	 * Note: Our Virtual Wire implementation is simplified, only
-	 * propagating PIT interrupts to all VCPUs when they have set
-	 * LVT0 to NMI delivery. Other PIC interrupts are just sent to
-	 * VCPU0, and only if its LVT0 is in EXTINT mode.
-	 */
-	if (kvm->arch.vapics_in_nmi_mode > 0)
-		kvm_for_each_vcpu(i, vcpu, kvm)
-			kvm_apic_nmi_wd_deliver(vcpu);
+       return 0;
 }
 
-void kvm_inject_pit_timer_irqs(struct kvm_vcpu *vcpu)
+void kvm_pit_cleanup(void)
 {
-	struct kvm_pit *pit = vcpu->kvm->arch.vpit;
-	struct kvm *kvm = vcpu->kvm;
-	struct kvm_kpit_state *ps;
-
-	if (pit) {
-		int inject = 0;
-		ps = &pit->pit_state;
-
-		/* Try to inject pending interrupts when
-		 * last one has been acked.
-		 */
-		raw_spin_lock(&ps->inject_lock);
-		if (atomic_read(&ps->pit_timer.pending) && ps->irq_ack) {
-			ps->irq_ack = 0;
-			inject = 1;
-		}
-		raw_spin_unlock(&ps->inject_lock);
-		if (inject)
-			__inject_pit_timer_intr(kvm);
-	}
+       destroy_workqueue(pit_wq);
 }
diff --git a/arch/x86/kvm/i8254.h b/arch/x86/kvm/i8254.h
index 900d6b0..78db13d 100644
--- a/arch/x86/kvm/i8254.h
+++ b/arch/x86/kvm/i8254.h
@@ -40,6 +40,7 @@ struct kvm_pit {
 	struct kvm_kpit_state pit_state;
 	int irq_source_id;
 	struct kvm_irq_mask_notifier mask_notifier;
+	struct work_struct expired;
 };
 
 #define KVM_PIT_BASE_ADDRESS	    0x40
@@ -55,4 +56,7 @@ struct kvm_pit *kvm_create_pit(struct kvm *kvm, u32 flags);
 void kvm_free_pit(struct kvm *kvm);
 void kvm_pit_reset(struct kvm_pit *pit);
 
+int kvm_pit_create(void);
+void kvm_pit_cleanup(void);
+
 #endif
diff --git a/arch/x86/kvm/irq.c b/arch/x86/kvm/irq.c
index 0f4e488..2095a04 100644
--- a/arch/x86/kvm/irq.c
+++ b/arch/x86/kvm/irq.c
@@ -90,7 +90,6 @@ EXPORT_SYMBOL_GPL(kvm_cpu_get_interrupt);
 void kvm_inject_pending_timer_irqs(struct kvm_vcpu *vcpu)
 {
 	kvm_inject_apic_timer_irqs(vcpu);
-	kvm_inject_pit_timer_irqs(vcpu);
 	/* TODO: PIT, RTC etc. */
 }
 EXPORT_SYMBOL_GPL(kvm_inject_pending_timer_irqs);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5fa8684..a8cf1e1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4125,6 +4125,8 @@ int kvm_arch_init(void *opaque)
 
 	perf_register_guest_info_callbacks(&kvm_guest_cbs);
 
+	kvm_pit_create();
+
 	return 0;
 
 out:
@@ -4133,6 +4135,8 @@ out:
 
 void kvm_arch_exit(void)
 {
+	kvm_pit_cleanup();
+
 	perf_unregister_guest_info_callbacks(&kvm_guest_cbs);
 
 	if (!boot_cpu_has(X86_FEATURE_CONSTANT_TSC))
-- 
1.6.6.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC][PATCH 2/3] Allow any LAPIC to accept PIC interrupts.
  2010-06-08 17:55 [RFC][PATCH 0/3]: Fixes to IRQ routing Chris Lalancette
  2010-06-08 17:55 ` [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts Chris Lalancette
@ 2010-06-08 17:55 ` Chris Lalancette
  2010-06-08 17:55 ` [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's Chris Lalancette
  2010-06-09 11:02 ` [RFC][PATCH 0/3]: Fixes to IRQ routing Avi Kivity
  3 siblings, 0 replies; 20+ messages in thread
From: Chris Lalancette @ 2010-06-08 17:55 UTC (permalink / raw)
  To: kvm; +Cc: avi, mtosatti, Chris Lalancette

If the guest wants to accept timer interrupts on a CPU other
than the BSP, we need to remove this gate.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
---
 arch/x86/kvm/lapic.c |   12 +++++-------
 1 files changed, 5 insertions(+), 7 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index d8258a0..ee0f76c 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1107,13 +1107,11 @@ int kvm_apic_accept_pic_intr(struct kvm_vcpu *vcpu)
 	u32 lvt0 = apic_get_reg(vcpu->arch.apic, APIC_LVT0);
 	int r = 0;
 
-	if (kvm_vcpu_is_bsp(vcpu)) {
-		if (!apic_hw_enabled(vcpu->arch.apic))
-			r = 1;
-		if ((lvt0 & APIC_LVT_MASKED) == 0 &&
-		    GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT)
-			r = 1;
-	}
+	if (!apic_hw_enabled(vcpu->arch.apic))
+		r = 1;
+	if ((lvt0 & APIC_LVT_MASKED) == 0 &&
+	    GET_APIC_DELIVERY_MODE(lvt0) == APIC_MODE_EXTINT)
+		r = 1;
 	return r;
 }
 
-- 
1.6.6.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's
  2010-06-08 17:55 [RFC][PATCH 0/3]: Fixes to IRQ routing Chris Lalancette
  2010-06-08 17:55 ` [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts Chris Lalancette
  2010-06-08 17:55 ` [RFC][PATCH 2/3] Allow any LAPIC to accept PIC interrupts Chris Lalancette
@ 2010-06-08 17:55 ` Chris Lalancette
  2010-06-09 11:01   ` Gleb Natapov
  2010-06-09 11:02 ` [RFC][PATCH 0/3]: Fixes to IRQ routing Avi Kivity
  3 siblings, 1 reply; 20+ messages in thread
From: Chris Lalancette @ 2010-06-08 17:55 UTC (permalink / raw)
  To: kvm; +Cc: avi, mtosatti, Chris Lalancette

Otherwise we might try to deliver a timer interrupt to a cpu that
can't possibly handle it.

Signed-off-by: Chris Lalancette <clalance@redhat.com>
---
 virt/kvm/irq_comm.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 52f412f..06cf61e 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -100,7 +100,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
 			if (r < 0)
 				r = 0;
 			r += kvm_apic_set_irq(vcpu, irq);
-		} else {
+		} else if (kvm_lapic_enabled(vcpu)) {
 			if (!lowest)
 				lowest = vcpu;
 			else if (kvm_apic_compare_prio(vcpu, lowest) < 0)
-- 
1.6.6.1


^ permalink raw reply related	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-08 17:55 ` [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts Chris Lalancette
@ 2010-06-09  2:36   ` Zachary Amsden
  2010-06-09 13:23     ` Marcelo Tosatti
  2010-06-09 10:55   ` Avi Kivity
  2010-06-09 13:49   ` Marcelo Tosatti
  2 siblings, 1 reply; 20+ messages in thread
From: Zachary Amsden @ 2010-06-09  2:36 UTC (permalink / raw)
  To: Chris Lalancette; +Cc: kvm, avi, mtosatti

On 06/08/2010 07:55 AM, Chris Lalancette wrote:
> We really want to "kvm_set_irq" during the hrtimer callback,
> but that is risky because that is during interrupt context.
> Instead, offload the work to a workqueue, which is a bit safer
> and should provide most of the same functionality.
>
> Signed-off-by: Chris Lalancette<clalance@redhat.com>
> ---
>   arch/x86/kvm/i8254.c |  106 ++++++++++++++++++++++++++++++-------------------
>   arch/x86/kvm/i8254.h |    4 ++
>   arch/x86/kvm/irq.c   |    1 -
>   arch/x86/kvm/x86.c   |    4 ++
>   4 files changed, 73 insertions(+), 42 deletions(-)
>
> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> index 188d827..dd655ef 100644
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -34,6 +34,7 @@
>
>   #include<linux/kvm_host.h>
>   #include<linux/slab.h>
> +#include<linux/workqueue.h>
>
>   #include "irq.h"
>   #include "i8254.h"
> @@ -49,6 +50,8 @@
>   #define RW_STATE_WORD0 3
>   #define RW_STATE_WORD1 4
>
> +static struct workqueue_struct *pit_wq;
> +
>   /* Compute with 96 bit intermediate result: (a*b)/c */
>   static u64 muldiv64(u64 a, u32 b, u32 c)
>   {
> @@ -281,6 +284,58 @@ static struct kvm_timer_ops kpit_ops = {
>   	.is_periodic = kpit_is_periodic,
>   };
>
> +static void pit_do_work(struct work_struct *work)
> +{
> +       struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
> +       struct kvm *kvm = pit->kvm;
> +       struct kvm_vcpu *vcpu;
> +       int i;
> +       struct kvm_kpit_state *ps =&pit->pit_state;
> +       int inject = 0;
> +
> +       /* Try to inject pending interrupts when
> +        * last one has been acked.
> +        */
> +       raw_spin_lock(&ps->inject_lock);
> +       if (ps->irq_ack) {
> +               ps->irq_ack = 0;
> +               inject = 1;
> +       }
> +       raw_spin_unlock(&ps->inject_lock);
> +       if (inject) {
> +               kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
> +               kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
> +
> +               /*
> +                * Provides NMI watchdog support via Virtual Wire mode.
> +                * The route is: PIT ->  PIC ->  LVT0 in NMI mode.
> +                *
> +                * Note: Our Virtual Wire implementation is simplified, only
> +                * propagating PIT interrupts to all VCPUs when they have set
> +                * LVT0 to NMI delivery. Other PIC interrupts are just sent to
> +                * VCPU0, and only if its LVT0 is in EXTINT mode.
> +                */
> +               if (kvm->arch.vapics_in_nmi_mode>  0)
> +                       kvm_for_each_vcpu(i, vcpu, kvm)
> +                               kvm_apic_nmi_wd_deliver(vcpu);
> +       }
> +}
> +
> +static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
> +{
> +       struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
> +       struct kvm_pit *pt = ktimer->kvm->arch.vpit;
> +
> +       queue_work(pit_wq,&pt->expired);
> +
> +       if (ktimer->t_ops->is_periodic(ktimer)) {
> +               hrtimer_add_expires_ns(&ktimer->timer, ktimer->period);
> +              return HRTIMER_RESTART;
> +       }
> +       else
> +              return HRTIMER_NORESTART;
> +}
> +
>   static void create_pit_timer(struct kvm_kpit_state *ps, u32 val, int is_period)
>   {
>   	struct kvm_timer *pt =&ps->pit_timer;
> @@ -295,10 +350,9 @@ static void create_pit_timer(struct kvm_kpit_state *ps, u32 val, int is_period)
>   	pt->period = interval;
>   	ps->is_periodic = is_period;
>
> -	pt->timer.function = kvm_timer_fn;
> +	pt->timer.function = pit_timer_fn;
>
>    

I am happy to see this.  I thought kvm_timer_fn was a step backwards; it 
was too general of a function to justify the savings of 20 some odd 
lines of code.

Is there any chance that using a workqueue might help the problem of 
hrtimers firing too quickly?  I wanted to return HR_NORESTART from 
pit_timer_fn always, then restart the hrtimer on delivery, but because 
of unreliable delivery, it wasn't clear how to do that.

Perhaps the workqueue can be used to restart the timer instead, avoiding 
problems of impossibly small timeouts causing hrtimers to run amok.

Zach

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-08 17:55 ` [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts Chris Lalancette
  2010-06-09  2:36   ` Zachary Amsden
@ 2010-06-09 10:55   ` Avi Kivity
  2010-06-09 13:49   ` Marcelo Tosatti
  2 siblings, 0 replies; 20+ messages in thread
From: Avi Kivity @ 2010-06-09 10:55 UTC (permalink / raw)
  To: Chris Lalancette; +Cc: kvm, mtosatti

On 06/08/2010 08:55 PM, Chris Lalancette wrote:
> We really want to "kvm_set_irq" during the hrtimer callback,
> but that is risky because that is during interrupt context.
> Instead, offload the work to a workqueue, which is a bit safer
> and should provide most of the same functionality.
>
>
>
>
> +static void pit_do_work(struct work_struct *work)
> +{
> +       struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
> +       struct kvm *kvm = pit->kvm;
> +       struct kvm_vcpu *vcpu;
> +       int i;
> +       struct kvm_kpit_state *ps =&pit->pit_state;
> +       int inject = 0;
> +
> +       /* Try to inject pending interrupts when
> +        * last one has been acked.
> +        */
> +       raw_spin_lock(&ps->inject_lock);
> +       if (ps->irq_ack) {
> +               ps->irq_ack = 0;
> +               inject = 1;
> +       }
> +       raw_spin_unlock(&ps->inject_lock);
>    

I don't think this needs to be a raw spinlock any longer (a nice side 
effect).+
> +static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
> +{
> +       struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
> +       struct kvm_pit *pt = ktimer->kvm->arch.vpit;
> +
> +       queue_work(pit_wq,&pt->expired);
> +
> +       if (ktimer->t_ops->is_periodic(ktimer)) {
> +               hrtimer_add_expires_ns(&ktimer->timer, ktimer->period);
>    

spaces/tabs

> +              return HRTIMER_RESTART;
> +       }
> +       else
> +              return HRTIMER_NORESTART;
> +}
> +
>    

Somewhere, you have to cancel a pending work struct.  At least during 
destruction, but perhaps also if the pit is configured not to generate 
interrupts any more.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's
  2010-06-08 17:55 ` [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's Chris Lalancette
@ 2010-06-09 11:01   ` Gleb Natapov
  2010-06-09 11:08     ` Avi Kivity
  0 siblings, 1 reply; 20+ messages in thread
From: Gleb Natapov @ 2010-06-09 11:01 UTC (permalink / raw)
  To: Chris Lalancette; +Cc: kvm, avi, mtosatti

On Tue, Jun 08, 2010 at 01:55:03PM -0400, Chris Lalancette wrote:
> Otherwise we might try to deliver a timer interrupt to a cpu that
> can't possibly handle it.
> 
> Signed-off-by: Chris Lalancette <clalance@redhat.com>
> ---
>  virt/kvm/irq_comm.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)
> 
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index 52f412f..06cf61e 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -100,7 +100,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
>  			if (r < 0)
>  				r = 0;
>  			r += kvm_apic_set_irq(vcpu, irq);
> -		} else {
> +		} else if (kvm_lapic_enabled(vcpu)) {
>  			if (!lowest)
>  				lowest = vcpu;
>  			else if (kvm_apic_compare_prio(vcpu, lowest) < 0)
Shouldn't we check kvm_lapic_enabled(vcpu) at the beginning of the loop?
Something like:
   if (!kvm_apic_present(vcpu) || !kvm_lapic_enabled(vcpu))
	continue;

--
			Gleb.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 0/3]: Fixes to IRQ routing
  2010-06-08 17:55 [RFC][PATCH 0/3]: Fixes to IRQ routing Chris Lalancette
                   ` (2 preceding siblings ...)
  2010-06-08 17:55 ` [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's Chris Lalancette
@ 2010-06-09 11:02 ` Avi Kivity
  3 siblings, 0 replies; 20+ messages in thread
From: Avi Kivity @ 2010-06-09 11:02 UTC (permalink / raw)
  To: Chris Lalancette; +Cc: kvm, mtosatti

On 06/08/2010 08:55 PM, Chris Lalancette wrote:
> As we've discussed previously, here is a series of patches to
> fix some of the IRQ routing issues we have in KVM.  With this series
> in place I was able to successfully kdump a RHEL-5 64-bit, and RHEL-6
> 32- and 64-bit guest on CPU's other than the BSP.  RHEL-5 32-bit kdump still
> does not work; it gets stuck on "Checking 'hlt' instruction".  However,
> it does that both before and after this series, so there is something
> else going on there that I still have to debug.
>
> I also need to change the "kvm_migrate_pit_timer" function to migrate the
> timer over to the last CPU that handled the timer interrupt, on the
> theory that that particlar CPU is likely to handle the timer interrupt again
> in the near future.
>
> The other outstanding question about these patches is how to setup the
> workqueue.  As it stands I have a single workqueue for all guests.  However,
> for efficiency reasons we may want to have one workqueue per guest.  Opinions?
>    

Perhaps a singlethreaded workqueue per guest?  Or perhaps just 
schedule_work(), and let the kernel worry about the details?


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's
  2010-06-09 11:01   ` Gleb Natapov
@ 2010-06-09 11:08     ` Avi Kivity
  2010-06-09 11:20       ` Gleb Natapov
  0 siblings, 1 reply; 20+ messages in thread
From: Avi Kivity @ 2010-06-09 11:08 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Chris Lalancette, kvm, mtosatti

On 06/09/2010 02:01 PM, Gleb Natapov wrote:
> On Tue, Jun 08, 2010 at 01:55:03PM -0400, Chris Lalancette wrote:
>    
>> Otherwise we might try to deliver a timer interrupt to a cpu that
>> can't possibly handle it.
>>
>> Signed-off-by: Chris Lalancette<clalance@redhat.com>
>> ---
>>   virt/kvm/irq_comm.c |    2 +-
>>   1 files changed, 1 insertions(+), 1 deletions(-)
>>
>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
>> index 52f412f..06cf61e 100644
>> --- a/virt/kvm/irq_comm.c
>> +++ b/virt/kvm/irq_comm.c
>> @@ -100,7 +100,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
>>   			if (r<  0)
>>   				r = 0;
>>   			r += kvm_apic_set_irq(vcpu, irq);
>> -		} else {
>> +		} else if (kvm_lapic_enabled(vcpu)) {
>>   			if (!lowest)
>>   				lowest = vcpu;
>>   			else if (kvm_apic_compare_prio(vcpu, lowest)<  0)
>>      
> Shouldn't we check kvm_lapic_enabled(vcpu) at the beginning of the loop?
> Something like:
>     if (!kvm_apic_present(vcpu) || !kvm_lapic_enabled(vcpu))
> 	continue;
>
>    

The apic still accepts some interrupts even if disabled, so this needs 
to be very conditional.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's
  2010-06-09 11:08     ` Avi Kivity
@ 2010-06-09 11:20       ` Gleb Natapov
  2010-06-09 11:22         ` Avi Kivity
  0 siblings, 1 reply; 20+ messages in thread
From: Gleb Natapov @ 2010-06-09 11:20 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Chris Lalancette, kvm, mtosatti

On Wed, Jun 09, 2010 at 02:08:31PM +0300, Avi Kivity wrote:
> On 06/09/2010 02:01 PM, Gleb Natapov wrote:
> >On Tue, Jun 08, 2010 at 01:55:03PM -0400, Chris Lalancette wrote:
> >>Otherwise we might try to deliver a timer interrupt to a cpu that
> >>can't possibly handle it.
> >>
> >>Signed-off-by: Chris Lalancette<clalance@redhat.com>
> >>---
> >>  virt/kvm/irq_comm.c |    2 +-
> >>  1 files changed, 1 insertions(+), 1 deletions(-)
> >>
> >>diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> >>index 52f412f..06cf61e 100644
> >>--- a/virt/kvm/irq_comm.c
> >>+++ b/virt/kvm/irq_comm.c
> >>@@ -100,7 +100,7 @@ int kvm_irq_delivery_to_apic(struct kvm *kvm, struct kvm_lapic *src,
> >>  			if (r<  0)
> >>  				r = 0;
> >>  			r += kvm_apic_set_irq(vcpu, irq);
> >>-		} else {
> >>+		} else if (kvm_lapic_enabled(vcpu)) {
> >>  			if (!lowest)
> >>  				lowest = vcpu;
> >>  			else if (kvm_apic_compare_prio(vcpu, lowest)<  0)
> >Shouldn't we check kvm_lapic_enabled(vcpu) at the beginning of the loop?
> >Something like:
> >    if (!kvm_apic_present(vcpu) || !kvm_lapic_enabled(vcpu))
> >	continue;
> >
> 
> The apic still accepts some interrupts even if disabled, so this
> needs to be very conditional.
> 
What kind of interrupt and can they be delivered in DM_LOWEST mode?

--
			Gleb.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's
  2010-06-09 11:20       ` Gleb Natapov
@ 2010-06-09 11:22         ` Avi Kivity
  0 siblings, 0 replies; 20+ messages in thread
From: Avi Kivity @ 2010-06-09 11:22 UTC (permalink / raw)
  To: Gleb Natapov; +Cc: Chris Lalancette, kvm, mtosatti

On 06/09/2010 02:20 PM, Gleb Natapov wrote:
>
>> The apic still accepts some interrupts even if disabled, so this
>> needs to be very conditional.
>>
>>      
> What kind of interrupt and can they be delivered in DM_LOWEST mode?
>    

INIT, NMI, SMI, and SIPI.  So no DM_LOWEST.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-09  2:36   ` Zachary Amsden
@ 2010-06-09 13:23     ` Marcelo Tosatti
  2010-06-09 14:16       ` Avi Kivity
  2010-06-09 21:17       ` Zachary Amsden
  0 siblings, 2 replies; 20+ messages in thread
From: Marcelo Tosatti @ 2010-06-09 13:23 UTC (permalink / raw)
  To: Zachary Amsden; +Cc: Chris Lalancette, kvm, avi

On Tue, Jun 08, 2010 at 04:36:16PM -1000, Zachary Amsden wrote:
> >+	pt->timer.function = pit_timer_fn;
> >
> 
> I am happy to see this.  I thought kvm_timer_fn was a step
> backwards; it was too general of a function to justify the savings
> of 20 some odd lines of code.

This was not done to save 20 lines of code:

http://www.mail-archive.com/kvm@vger.kernel.org/msg18640.html

> Is there any chance that using a workqueue might help the problem of
> hrtimers firing too quickly?  I wanted to return HR_NORESTART from
> pit_timer_fn always, then restart the hrtimer on delivery, but
> because of unreliable delivery, it wasn't clear how to do that.
>
> Perhaps the workqueue can be used to restart the timer instead,
> avoiding problems of impossibly small timeouts causing hrtimers to
> run amok.

It should be rearmed on ACK ideally. How come delivery is unreliable?

BTW, a massive amount of hrtimers (_single_ non-tickless 32-vcpu guest
with LAPIC) results in:

hrtimer: interrupt took 122448 ns

in the host. So don't even need impossibly small timeouts :(

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-08 17:55 ` [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts Chris Lalancette
  2010-06-09  2:36   ` Zachary Amsden
  2010-06-09 10:55   ` Avi Kivity
@ 2010-06-09 13:49   ` Marcelo Tosatti
  2010-06-09 14:20     ` Avi Kivity
  2 siblings, 1 reply; 20+ messages in thread
From: Marcelo Tosatti @ 2010-06-09 13:49 UTC (permalink / raw)
  To: Chris Lalancette; +Cc: kvm, avi

On Tue, Jun 08, 2010 at 01:55:01PM -0400, Chris Lalancette wrote:
> We really want to "kvm_set_irq" during the hrtimer callback,
> but that is risky because that is during interrupt context.
> Instead, offload the work to a workqueue, which is a bit safer
> and should provide most of the same functionality.
> 
> Signed-off-by: Chris Lalancette <clalance@redhat.com>
> ---
>  arch/x86/kvm/i8254.c |  106 ++++++++++++++++++++++++++++++-------------------
>  arch/x86/kvm/i8254.h |    4 ++
>  arch/x86/kvm/irq.c   |    1 -
>  arch/x86/kvm/x86.c   |    4 ++
>  4 files changed, 73 insertions(+), 42 deletions(-)
> 
> diff --git a/arch/x86/kvm/i8254.c b/arch/x86/kvm/i8254.c
> index 188d827..dd655ef 100644
> --- a/arch/x86/kvm/i8254.c
> +++ b/arch/x86/kvm/i8254.c
> @@ -34,6 +34,7 @@
>  
>  #include <linux/kvm_host.h>
>  #include <linux/slab.h>
> +#include <linux/workqueue.h>
>  
>  #include "irq.h"
>  #include "i8254.h"
> @@ -49,6 +50,8 @@
>  #define RW_STATE_WORD0 3
>  #define RW_STATE_WORD1 4
>  
> +static struct workqueue_struct *pit_wq;
> +
>  /* Compute with 96 bit intermediate result: (a*b)/c */
>  static u64 muldiv64(u64 a, u32 b, u32 c)
>  {
> @@ -281,6 +284,58 @@ static struct kvm_timer_ops kpit_ops = {
>  	.is_periodic = kpit_is_periodic,
>  };
>  
> +static void pit_do_work(struct work_struct *work)
> +{
> +       struct kvm_pit *pit = container_of(work, struct kvm_pit, expired);
> +       struct kvm *kvm = pit->kvm;
> +       struct kvm_vcpu *vcpu;
> +       int i;
> +       struct kvm_kpit_state *ps = &pit->pit_state;
> +       int inject = 0;
> +
> +       /* Try to inject pending interrupts when
> +        * last one has been acked.
> +        */
> +       raw_spin_lock(&ps->inject_lock);
> +       if (ps->irq_ack) {
> +               ps->irq_ack = 0;
> +               inject = 1;
> +       }
> +       raw_spin_unlock(&ps->inject_lock);
> +       if (inject) {
> +               kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 1);
> +               kvm_set_irq(kvm, kvm->arch.vpit->irq_source_id, 0, 0);
> +
> +               /*
> +                * Provides NMI watchdog support via Virtual Wire mode.
> +                * The route is: PIT -> PIC -> LVT0 in NMI mode.
> +                *
> +                * Note: Our Virtual Wire implementation is simplified, only
> +                * propagating PIT interrupts to all VCPUs when they have set
> +                * LVT0 to NMI delivery. Other PIC interrupts are just sent to
> +                * VCPU0, and only if its LVT0 is in EXTINT mode.
> +                */
> +               if (kvm->arch.vapics_in_nmi_mode > 0)
> +                       kvm_for_each_vcpu(i, vcpu, kvm)
> +                               kvm_apic_nmi_wd_deliver(vcpu);
> +       }
> +}
> +
> +static enum hrtimer_restart pit_timer_fn(struct hrtimer *data)
> +{
> +       struct kvm_timer *ktimer = container_of(data, struct kvm_timer, timer);
> +       struct kvm_pit *pt = ktimer->kvm->arch.vpit;
> +
> +       queue_work(pit_wq, &pt->expired);
> +
> +       if (ktimer->t_ops->is_periodic(ktimer)) {
> +               hrtimer_add_expires_ns(&ktimer->timer, ktimer->period);
> +              return HRTIMER_RESTART;
> +       }
> +       else
> +              return HRTIMER_NORESTART;
> +}

Certain guests do not compensate for lost ticks (older RHEL3 for ex),
and this completly reinjection.

Perhaps you can increase the pending counter on pit_timer_fn, and
reinject on ACK?

Also, due to
http://www.mail-archive.com/kvm-devel@lists.sourceforge.net/msg13250.html,
it should be safer to keep PIT locked to vcpu0 if IOAPIC dest mode is 
lowprio.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-09 13:23     ` Marcelo Tosatti
@ 2010-06-09 14:16       ` Avi Kivity
  2010-06-09 14:40         ` Marcelo Tosatti
  2010-06-09 21:17       ` Zachary Amsden
  1 sibling, 1 reply; 20+ messages in thread
From: Avi Kivity @ 2010-06-09 14:16 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Zachary Amsden, Chris Lalancette, kvm

On 06/09/2010 04:23 PM, Marcelo Tosatti wrote:
>
>> Is there any chance that using a workqueue might help the problem of
>> hrtimers firing too quickly?  I wanted to return HR_NORESTART from
>> pit_timer_fn always, then restart the hrtimer on delivery, but
>> because of unreliable delivery, it wasn't clear how to do that.
>>
>> Perhaps the workqueue can be used to restart the timer instead,
>> avoiding problems of impossibly small timeouts causing hrtimers to
>> run amok.
>>      
> It should be rearmed on ACK ideally. How come delivery is unreliable?
>
> BTW, a massive amount of hrtimers (_single_ non-tickless 32-vcpu guest
> with LAPIC) results in:
>
> hrtimer: interrupt took 122448 ns
>
> in the host. So don't even need impossibly small timeouts :(
>    

Any idea where that came from?  Even looping on vcpus, that took 4 usec 
per vcpu.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-09 13:49   ` Marcelo Tosatti
@ 2010-06-09 14:20     ` Avi Kivity
  0 siblings, 0 replies; 20+ messages in thread
From: Avi Kivity @ 2010-06-09 14:20 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Chris Lalancette, kvm

On 06/09/2010 04:49 PM, Marcelo Tosatti wrote:
>
> Certain guests do not compensate for lost ticks (older RHEL3 for ex),
> and this completly reinjection.
>
> Perhaps you can increase the pending counter on pit_timer_fn, and
> reinject on ACK?
>
> Also, due to
> http://www.mail-archive.com/kvm-devel@lists.sourceforge.net/msg13250.html,
> it should be safer to keep PIT locked to vcpu0 if IOAPIC dest mode is
> lowprio.
>
>    

Yes.  We don't actually need to lock the PIT to vcpu 0, instead just 
make sure the lowprio selection picks vcpu 0 for irq 0/2.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-09 14:16       ` Avi Kivity
@ 2010-06-09 14:40         ` Marcelo Tosatti
  0 siblings, 0 replies; 20+ messages in thread
From: Marcelo Tosatti @ 2010-06-09 14:40 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Zachary Amsden, Chris Lalancette, kvm

On Wed, Jun 09, 2010 at 05:16:47PM +0300, Avi Kivity wrote:
> On 06/09/2010 04:23 PM, Marcelo Tosatti wrote:
> >
> >>Is there any chance that using a workqueue might help the problem of
> >>hrtimers firing too quickly?  I wanted to return HR_NORESTART from
> >>pit_timer_fn always, then restart the hrtimer on delivery, but
> >>because of unreliable delivery, it wasn't clear how to do that.
> >>
> >>Perhaps the workqueue can be used to restart the timer instead,
> >>avoiding problems of impossibly small timeouts causing hrtimers to
> >>run amok.
> >It should be rearmed on ACK ideally. How come delivery is unreliable?
> >
> >BTW, a massive amount of hrtimers (_single_ non-tickless 32-vcpu guest
> >with LAPIC) results in:
> >
> >hrtimer: interrupt took 122448 ns
> >
> >in the host. So don't even need impossibly small timeouts :(
> 
> Any idea where that came from?  Even looping on vcpus, that took 4
> usec per vcpu.

The rearming of timers (that emulate LAPIC) cause the host hrtimer
interrupt handler to re-execute its "run expired timers" loop. If timing
is unfortunate, it'll have to do that several times.


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-09 13:23     ` Marcelo Tosatti
  2010-06-09 14:16       ` Avi Kivity
@ 2010-06-09 21:17       ` Zachary Amsden
  2010-06-10  9:24         ` Avi Kivity
  2010-06-10 16:18         ` Marcelo Tosatti
  1 sibling, 2 replies; 20+ messages in thread
From: Zachary Amsden @ 2010-06-09 21:17 UTC (permalink / raw)
  To: Marcelo Tosatti; +Cc: Chris Lalancette, kvm, avi

On 06/09/2010 03:23 AM, Marcelo Tosatti wrote:
> On Tue, Jun 08, 2010 at 04:36:16PM -1000, Zachary Amsden wrote:
>    
>>> +	pt->timer.function = pit_timer_fn;
>>>
>>>        
>> I am happy to see this.  I thought kvm_timer_fn was a step
>> backwards; it was too general of a function to justify the savings
>> of 20 some odd lines of code.
>>      
> This was not done to save 20 lines of code:
>
> http://www.mail-archive.com/kvm@vger.kernel.org/msg18640.html
>
>    
>> Is there any chance that using a workqueue might help the problem of
>> hrtimers firing too quickly?  I wanted to return HR_NORESTART from
>> pit_timer_fn always, then restart the hrtimer on delivery, but
>> because of unreliable delivery, it wasn't clear how to do that.
>>
>> Perhaps the workqueue can be used to restart the timer instead,
>> avoiding problems of impossibly small timeouts causing hrtimers to
>> run amok.
>>      
> It should be rearmed on ACK ideally. How come delivery is unreliable?
>    

Due to the various ways IRQ can be latched or not with different 
routing.  If you program a very short timeout, it should fire even if 
the guest is slow to respond:

i..i..i..i..i..i..i
..a...a...a...a

if you don't restart right away, only on ack, you can risk the refire of 
the interrupt getting lost, especially if the guest happens to reprogram 
the PIT while this is happening...

Of course, it can be tracked properly, but it requires quite a few more 
state variables and some careful reasoning to make sure it doesn't drop 
irqs.

This patch series looks as if it will make that sort of reasoning easier 
to do :)

> BTW, a massive amount of hrtimers (_single_ non-tickless 32-vcpu guest
> with LAPIC) results in:
>
> hrtimer: interrupt took 122448 ns
>
> in the host. So don't even need impossibly small timeouts :(
>    
32-vcpu on how many pcpu?

^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-09 21:17       ` Zachary Amsden
@ 2010-06-10  9:24         ` Avi Kivity
  2010-06-10 19:17           ` Zachary Amsden
  2010-06-10 16:18         ` Marcelo Tosatti
  1 sibling, 1 reply; 20+ messages in thread
From: Avi Kivity @ 2010-06-10  9:24 UTC (permalink / raw)
  To: Zachary Amsden; +Cc: Marcelo Tosatti, Chris Lalancette, kvm

On 06/10/2010 12:17 AM, Zachary Amsden wrote:
> On 06/09/2010 03:23 AM, Marcelo Tosatti wrote:
>> On Tue, Jun 08, 2010 at 04:36:16PM -1000, Zachary Amsden wrote:
>>>> +    pt->timer.function = pit_timer_fn;
>>>>
>>> I am happy to see this.  I thought kvm_timer_fn was a step
>>> backwards; it was too general of a function to justify the savings
>>> of 20 some odd lines of code.
>> This was not done to save 20 lines of code:
>>
>> http://www.mail-archive.com/kvm@vger.kernel.org/msg18640.html
>>
>>> Is there any chance that using a workqueue might help the problem of
>>> hrtimers firing too quickly?  I wanted to return HR_NORESTART from
>>> pit_timer_fn always, then restart the hrtimer on delivery, but
>>> because of unreliable delivery, it wasn't clear how to do that.
>>>
>>> Perhaps the workqueue can be used to restart the timer instead,
>>> avoiding problems of impossibly small timeouts causing hrtimers to
>>> run amok.
>> It should be rearmed on ACK ideally. How come delivery is unreliable?
>
> Due to the various ways IRQ can be latched or not with different 
> routing.  If you program a very short timeout, it should fire even if 
> the guest is slow to respond:
>
> i..i..i..i..i..i..i
> ..a...a...a...a
>
> if you don't restart right away, only on ack, you can risk the refire 
> of the interrupt getting lost, especially if the guest happens to 
> reprogram the PIT while this is happening...
>

Restarting on ack has the big advantage of not rearming at all if the 
guest doesn't ack, i.e. a timer left running talking to a masked irq pin.

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-09 21:17       ` Zachary Amsden
  2010-06-10  9:24         ` Avi Kivity
@ 2010-06-10 16:18         ` Marcelo Tosatti
  1 sibling, 0 replies; 20+ messages in thread
From: Marcelo Tosatti @ 2010-06-10 16:18 UTC (permalink / raw)
  To: Zachary Amsden; +Cc: Chris Lalancette, kvm, avi

On Wed, Jun 09, 2010 at 11:17:41AM -1000, Zachary Amsden wrote:
> On 06/09/2010 03:23 AM, Marcelo Tosatti wrote:
> >On Tue, Jun 08, 2010 at 04:36:16PM -1000, Zachary Amsden wrote:
> >>>+	pt->timer.function = pit_timer_fn;
> >>>
> >>I am happy to see this.  I thought kvm_timer_fn was a step
> >>backwards; it was too general of a function to justify the savings
> >>of 20 some odd lines of code.
> >This was not done to save 20 lines of code:
> >
> >http://www.mail-archive.com/kvm@vger.kernel.org/msg18640.html
> >
> >>Is there any chance that using a workqueue might help the problem of
> >>hrtimers firing too quickly?  I wanted to return HR_NORESTART from
> >>pit_timer_fn always, then restart the hrtimer on delivery, but
> >>because of unreliable delivery, it wasn't clear how to do that.
> >>
> >>Perhaps the workqueue can be used to restart the timer instead,
> >>avoiding problems of impossibly small timeouts causing hrtimers to
> >>run amok.
> >It should be rearmed on ACK ideally. How come delivery is unreliable?
> 
> Due to the various ways IRQ can be latched or not with different
> routing.  If you program a very short timeout, it should fire even
> if the guest is slow to respond:
> 
> i..i..i..i..i..i..i
> ..a...a...a...a
> 
> if you don't restart right away, only on ack, you can risk the
> refire of the interrupt getting lost, especially if the guest
> happens to reprogram the PIT while this is happening...
> 
> Of course, it can be tracked properly, but it requires quite a few
> more state variables and some careful reasoning to make sure it
> doesn't drop irqs.
> 
> This patch series looks as if it will make that sort of reasoning
> easier to do :)
> 
> >BTW, a massive amount of hrtimers (_single_ non-tickless 32-vcpu guest
> >with LAPIC) results in:
> >
> >hrtimer: interrupt took 122448 ns
> >
> >in the host. So don't even need impossibly small timeouts :(
> 32-vcpu on how many pcpu?

8


^ permalink raw reply	[flat|nested] 20+ messages in thread

* Re: [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts.
  2010-06-10  9:24         ` Avi Kivity
@ 2010-06-10 19:17           ` Zachary Amsden
  0 siblings, 0 replies; 20+ messages in thread
From: Zachary Amsden @ 2010-06-10 19:17 UTC (permalink / raw)
  To: Avi Kivity; +Cc: Marcelo Tosatti, Chris Lalancette, kvm

On 06/09/2010 11:24 PM, Avi Kivity wrote:
> On 06/10/2010 12:17 AM, Zachary Amsden wrote:
>> On 06/09/2010 03:23 AM, Marcelo Tosatti wrote:
>>> On Tue, Jun 08, 2010 at 04:36:16PM -1000, Zachary Amsden wrote:
>>>>> +    pt->timer.function = pit_timer_fn;
>>>>>
>>>> I am happy to see this.  I thought kvm_timer_fn was a step
>>>> backwards; it was too general of a function to justify the savings
>>>> of 20 some odd lines of code.
>>> This was not done to save 20 lines of code:
>>>
>>> http://www.mail-archive.com/kvm@vger.kernel.org/msg18640.html
>>>
>>>> Is there any chance that using a workqueue might help the problem of
>>>> hrtimers firing too quickly?  I wanted to return HR_NORESTART from
>>>> pit_timer_fn always, then restart the hrtimer on delivery, but
>>>> because of unreliable delivery, it wasn't clear how to do that.
>>>>
>>>> Perhaps the workqueue can be used to restart the timer instead,
>>>> avoiding problems of impossibly small timeouts causing hrtimers to
>>>> run amok.
>>> It should be rearmed on ACK ideally. How come delivery is unreliable?
>>
>> Due to the various ways IRQ can be latched or not with different 
>> routing.  If you program a very short timeout, it should fire even if 
>> the guest is slow to respond:
>>
>> i..i..i..i..i..i..i
>> ..a...a...a...a
>>
>> if you don't restart right away, only on ack, you can risk the refire 
>> of the interrupt getting lost, especially if the guest happens to 
>> reprogram the PIT while this is happening...
>>
>
> Restarting on ack has the big advantage of not rearming at all if the 
> guest doesn't ack, i.e. a timer left running talking to a masked irq pin.
>

Exactly... it scales too, with how fast we can manage to deliver the 
interrupts.  We just have to be sure we don't drop the re-fire 
completely in any race conditions.  Some stress will be required.

Zach

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2010-06-10 19:17 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-06-08 17:55 [RFC][PATCH 0/3]: Fixes to IRQ routing Chris Lalancette
2010-06-08 17:55 ` [RFC][PATCH 1/3] Introduce a workqueue to deliver PIT timer interrupts Chris Lalancette
2010-06-09  2:36   ` Zachary Amsden
2010-06-09 13:23     ` Marcelo Tosatti
2010-06-09 14:16       ` Avi Kivity
2010-06-09 14:40         ` Marcelo Tosatti
2010-06-09 21:17       ` Zachary Amsden
2010-06-10  9:24         ` Avi Kivity
2010-06-10 19:17           ` Zachary Amsden
2010-06-10 16:18         ` Marcelo Tosatti
2010-06-09 10:55   ` Avi Kivity
2010-06-09 13:49   ` Marcelo Tosatti
2010-06-09 14:20     ` Avi Kivity
2010-06-08 17:55 ` [RFC][PATCH 2/3] Allow any LAPIC to accept PIC interrupts Chris Lalancette
2010-06-08 17:55 ` [RFC][PATCH 3/3] In DM_LOWEST, only deliver interrupts to vcpus with enabled LAPIC's Chris Lalancette
2010-06-09 11:01   ` Gleb Natapov
2010-06-09 11:08     ` Avi Kivity
2010-06-09 11:20       ` Gleb Natapov
2010-06-09 11:22         ` Avi Kivity
2010-06-09 11:02 ` [RFC][PATCH 0/3]: Fixes to IRQ routing Avi Kivity

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.