* [Xenomai-core] [RFC] KVM over Xenomai and I-pipe
@ 2010-03-12 15:25 Jan Kiszka
  2010-04-23 11:04 ` Philippe Gerum
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kiszka @ 2010-03-12 15:25 UTC (permalink / raw)
  To: xenomai-core; +Cc: xenomai-help, adeos-main

[-- Attachment #1: Type: text/plain, Size: 1736 bytes --]

Hi,

this is still in the state "study", but it is working fairly nicely so far:

These two patches harden latest KVM for use over I-pipe kernels and make
Xenomai aware of the lazy host state restoring that KVM uses for
performance reasons. The latter basically means calling the sched-out
notifier that KVM registers with the kernel when switching from a Linux
task to some shadow. This is safe in all recent versions of KVM and
still gives nice KVM performance (that of KVM before 2.6.32) without
significant impact on the RT latency (Note: if you have an old VT-x CPU,
guest-issued wbinvd will ruin RT as it is not intercepted by the hardware!).
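
To illustrate the second point, here is a minimal userspace sketch (not
kernel code) of why the sched-out notifier has to fire when a shadow
preempts a VCPU task. Every model_* name is invented for this
illustration; only the idea of walking the outgoing task's notifier list
comes from the Xenomai patch in this mail.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace model only; these are not the kernel's types. */
struct model_task;

struct model_notifier {
	void (*sched_out)(struct model_notifier *n, struct model_task *next);
	struct model_notifier *next;
};

struct model_task {
	struct model_notifier *notifiers;   /* stand-in for preempt_notifiers */
};

struct model_vcpu {
	struct model_notifier notifier;     /* must stay the first member */
	bool guest_state_loaded;            /* host state not yet restored */
};

/* Stand-in for KVM's sched-out hook: restore the lazily switched state. */
static void model_kvm_sched_out(struct model_notifier *n,
				struct model_task *next)
{
	/* Valid because the notifier is the first member of model_vcpu. */
	struct model_vcpu *vcpu = (struct model_vcpu *)n;

	(void)next;
	vcpu->guest_state_loaded = false;
}

/* Models a context switch away from prev, optionally firing its hooks. */
static void model_switch_to(struct model_task *prev, struct model_task *next,
			    bool fire_notifiers)
{
	struct model_notifier *n;

	assert(prev != NULL && next != NULL);
	if (!fire_notifiers)
		return;
	for (n = prev->notifiers; n != NULL; n = n->next)
		n->sched_out(n, next);
}

/* Returns true if the incoming task would still see stale guest state. */
static bool model_demo_stale_state(bool fire_notifiers)
{
	struct model_vcpu vcpu = {
		.notifier = { .sched_out = model_kvm_sched_out, .next = NULL },
		.guest_state_loaded = true,   /* VM exit left guest state in place */
	};
	struct model_task vcpu_task = { .notifiers = &vcpu.notifier };
	struct model_task shadow = { .notifiers = NULL };

	model_switch_to(&vcpu_task, &shadow, fire_notifiers);
	return vcpu.guest_state_loaded;
}
```

Calling model_demo_stale_state(false) models a switch that skips the
notifiers and leaves guest state behind; with true, the hook has
restored the host state before the shadow runs.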

To test it, you need to apply the kernel patch on top of current kvm.git
master [1], obtain kvm-kmod.git [2], run configure on it (assuming your
host kernel is a Xenomai one, otherwise use --kerneldir) and then "make
sync-kmod LINUX=/path/to/kvm.git". After a final make && make install,
you will have recent kvm modules that are I-pipe aware. The Xenomai
patch simply applies to the 2.5 tree. This has been tested with
ipipe-2.6.32-x86-2.6-01 + [3] and Xenomai-2.5 git.

Feedback welcome, specifically if you think it's worth integrating both
patches into upstream. The kernel bits would make sense over some
2.6.33-x86, but additional work will be required to account for the
user-return notifiers introduced with that release (kvm-kmod currently
wraps them away for older kernels).

Jan

[1]git://git.kernel.org/pub/scm/virt/kvm/kvm.git
[2]git://git.kernel.org/pub/scm/virt/kvm/kvm-kmod.git
[3]http://git.kiszka.org/?p=ipipe-2.6.git;a=commitdiff;h=0bddff1716aba6dd5ca11627ee377a5a25fa3dae

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux

[-- Attachment #2: 0001-Harden-KVM-for-use-over-I-pipe.patch --]
[-- Type: text/x-patch, Size: 3358 bytes --]

From 55480e98b8f35818a838bb2bd1f24764276a9b17 Mon Sep 17 00:00:00 2001
From: Jan Kiszka <jan.kiszka@domain.hid>
Date: Wed, 10 Mar 2010 08:32:02 +0100
Subject: [PATCH] Harden KVM for use over I-pipe

This allows KVM to be used on I-pipe-enabled kernels. I-pipe domains that
preempt a VCPU task and let the preempting task return to user space
additionally have to fire the sched-out notifiers of the VCPU task with
IRQs disabled. Those will restore host states that are lazily switched
for performance reasons.

Tested with modified Xenomai on Intel; should work with AMD hosts as
well.

Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
---
 arch/x86/kvm/svm.c |    4 ++--
 arch/x86/kvm/vmx.c |   10 ++++++----
 arch/x86/kvm/x86.c |   11 +++++++++--
 3 files changed, 17 insertions(+), 8 deletions(-)
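
Note for readers less familiar with I-pipe: the conversions in this diff
rely on the distinction between the virtual interrupt mask that Linux
(the root domain) sees and the real CPU interrupt flag. The sketch below
is a userspace model under that assumption, with invented model_* names;
it is not the I-pipe implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct model_cpu {
	bool hw_irqs_on;     /* the real CPU interrupt flag */
	bool root_stalled;   /* virtual mask that only Linux honours */
};

/* Plain variants: assumed to touch only the virtual root-domain mask. */
static void model_local_irq_disable(struct model_cpu *c)    { c->root_stalled = true; }
static void model_local_irq_enable(struct model_cpu *c)     { c->root_stalled = false; }
/* _hw variants: mask or unmask interrupts at the CPU itself. */
static void model_local_irq_disable_hw(struct model_cpu *c) { c->hw_irqs_on = false; }
static void model_local_irq_enable_hw(struct model_cpu *c)  { c->hw_irqs_on = true; }

/* A real-time domain only cares about the hardware flag. */
static bool model_rt_can_preempt(const struct model_cpu *c)
{
	assert(c != NULL);
	return c->hw_irqs_on;
}

/* Linux handlers run only if neither mask is set. */
static bool model_linux_irq_can_run(const struct model_cpu *c)
{
	assert(c != NULL);
	return c->hw_irqs_on && !c->root_stalled;
}

/* Guest-entry window modelled on the patched vcpu_enter_guest(). */
static bool model_demo_entry_window_is_safe(bool use_hw_mask)
{
	struct model_cpu cpu = { .hw_irqs_on = true, .root_stalled = false };
	bool safe;

	model_local_irq_disable(&cpu);
	if (use_hw_mask)
		model_local_irq_disable_hw(&cpu);

	/* Safe: nothing, real-time or Linux, can run while state is in flux. */
	safe = !model_rt_can_preempt(&cpu) && !model_linux_irq_can_run(&cpu);

	if (use_hw_mask)
		model_local_irq_enable_hw(&cpu);
	model_local_irq_enable(&cpu);
	return safe;
}
```

With only the virtual mask set, the model still lets a real-time domain
preempt inside the window, which is the case the _hw conversions are
meant to close.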

diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index def4877..7e81639 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -3025,7 +3025,7 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	clgi();
 
-	local_irq_enable();
+	local_irq_enable_hw();
 
 	asm volatile (
 		"push %%"R"bp; \n\t"
@@ -3110,7 +3110,7 @@ static void svm_vcpu_run(struct kvm_vcpu *vcpu)
 
 	reload_tss(vcpu);
 
-	local_irq_disable();
+	local_irq_disable_hw();
 
 	stgi();
 
diff --git a/arch/x86/kvm/vmx.c b/arch/x86/kvm/vmx.c
index ae3217d..741e2a1 100644
--- a/arch/x86/kvm/vmx.c
+++ b/arch/x86/kvm/vmx.c
@@ -736,12 +736,12 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 		 * If we have to reload gs, we must take care to
 		 * preserve our gs base.
 		 */
-		local_irq_save(flags);
+		local_irq_save_hw(flags);
 		kvm_load_gs(vmx->host_state.gs_sel);
 #ifdef CONFIG_X86_64
 		wrmsrl(MSR_GS_BASE, vmcs_readl(HOST_GS_BASE));
 #endif
-		local_irq_restore(flags);
+		local_irq_restore_hw(flags);
 	}
 	reload_tss();
 #ifdef CONFIG_X86_64
@@ -754,9 +754,11 @@ static void __vmx_load_host_state(struct vcpu_vmx *vmx)
 
 static void vmx_load_host_state(struct vcpu_vmx *vmx)
 {
-	preempt_disable();
+	unsigned long flags;
+
+	ipipe_preempt_disable(flags);
 	__vmx_load_host_state(vmx);
-	preempt_enable();
+	ipipe_preempt_enable(flags);
 }
 
 /*
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 703f637..713a392 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1657,8 +1657,12 @@ void kvm_arch_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
 
 void kvm_arch_vcpu_put(struct kvm_vcpu *vcpu)
 {
+	unsigned long flags;
+
+	local_irq_save_hw_cond(flags);
 	kvm_put_guest_fpu(vcpu);
 	kvm_x86_ops->vcpu_put(vcpu);
+	local_irq_restore_hw_cond(flags);
 }
 
 static int is_efer_nx(void)
@@ -4332,17 +4336,19 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 
 	preempt_disable();
 
+	local_irq_disable();
+	local_irq_disable_hw();
+
 	kvm_x86_ops->prepare_guest_switch(vcpu);
 	if (vcpu->fpu_active)
 		kvm_load_guest_fpu(vcpu);
 
-	local_irq_disable();
-
 	clear_bit(KVM_REQ_KICK, &vcpu->requests);
 	smp_mb__after_clear_bit();
 
 	if (vcpu->requests || need_resched() || signal_pending(current)) {
 		set_bit(KVM_REQ_KICK, &vcpu->requests);
+		local_irq_enable_hw();
 		local_irq_enable();
 		preempt_enable();
 		r = 1;
@@ -4388,6 +4394,7 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
 		hw_breakpoint_restore();
 
 	set_bit(KVM_REQ_KICK, &vcpu->requests);
+	local_irq_enable_hw();
 	local_irq_enable();
 
 	++vcpu->stat.exits;
-- 
1.6.0.2


[-- Attachment #3: 0001-Enable-KVM-on-Xenomai-kernels.patch --]
[-- Type: text/x-patch, Size: 1277 bytes --]

From 618e445548d38f712c4d5d108da627fd30207631 Mon Sep 17 00:00:00 2001
From: Jan Kiszka <jan.kiszka@domain.hid>
Date: Wed, 10 Mar 2010 10:35:36 +0100
Subject: [PATCH] Enable KVM on Xenomai kernels

Call the sched-out notifier (so far this can only be kvm_sched_out) when
switching Linux tasks. This restores the complete host state after a VM
exit and allows shadow threads to safely preempt VCPU threads.

Signed-off-by: Jan Kiszka <jan.kiszka@domain.hid>
---
 include/asm-x86/bits/pod_64.h |    9 +++++++++
 1 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/include/asm-x86/bits/pod_64.h b/include/asm-x86/bits/pod_64.h
index 88e049d..96793f7 100644
--- a/include/asm-x86/bits/pod_64.h
+++ b/include/asm-x86/bits/pod_64.h
@@ -68,6 +68,15 @@ static inline void xnarch_switch_to(xnarchtcb_t *out_tcb, xnarchtcb_t *in_tcb)
 	struct task_struct *next = in_tcb->user_task;
 
 	if (likely(next != NULL)) {
+#ifdef CONFIG_PREEMPT_NOTIFIERS
+		struct preempt_notifier *notifier;
+		struct hlist_node *node;
+
+		hlist_for_each_entry(notifier, node, &prev->preempt_notifiers,
+				     link)
+			notifier->ops->sched_out(notifier, next);
+#endif
+
 		if (task_thread_info(prev)->status & TS_USEDFPU)
 			/*
 			 * __switch_to will try and use __unlazy_fpu,
-- 
1.6.0.2



* Re: [Xenomai-core] [RFC] KVM over Xenomai and I-pipe
  2010-03-12 15:25 [Xenomai-core] [RFC] KVM over Xenomai and I-pipe Jan Kiszka
@ 2010-04-23 11:04 ` Philippe Gerum
  2010-04-23 12:15   ` Jan Kiszka
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2010-04-23 11:04 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: xenomai-help, adeos-main, xenomai-core

On Fri, 2010-03-12 at 16:25 +0100, Jan Kiszka wrote:
> Hi,
> 
> this is still in the state "study", but it is working fairly nicely so far:
> 
> These two patches harden latest KVM for use over I-pipe kernels and make
> Xenomai aware of the lazy host state restoring that KVM uses for
> performance reasons. The latter basically means calling the sched-out
> notifier that KVM registers with the kernel when switching from a Linux
> task to some shadow. This is safe in all recent versions of KVM and
> still gives nice KVM performance (that of KVM before 2.6.32) without
> significant impact on the RT latency (Note: if you have an old VT-x CPU,
> guest-issued wbinvd will ruin RT as it is not intercepted by the hardware!).
> 
> To test it, you need to apply the kernel patch on top of current kvm.git
> master [1], obtain kvm-kmod.git [2], run configure on it (assuming your
> host kernel is a Xenomai one, otherwise use --kerneldir) and then "make
> sync-kmod LINUX=/path/to/kvm.git". After a final make && make install,
> you will have recent kvm modules that are I-pipe aware. The Xenomai
> patch simply applies to the 2.5 tree. This has been tested with
> ipipe-2.6.32-x86-2.6-01 + [3] and Xenomai-2.5 git.
> 
> Feedback welcome, specifically if you think it's worth integrating both
> patches into upstream. The kernel bits would make sense over some
> 2.6.33-x86, but additional work will be required to account for the
> user-return notifiers introduced with that release (kvm-kmod currently
> wraps them away for older kernels).

No concern on the final goal, running a Xenomai-enabled kernel
rock-solid over KVM is a must.

The KVM code ironing from the 1st patch looks fine to me, no big deal to
maintain AFAICS. I would be only concerned by the 2nd patch,
specifically how the KVM callout is invoked from the Xenomai context
switching code:

- depending on CONFIG_PREEMPT_NOTIFIERS is much broader than required; I
guess that CONFIG_KVM would be enough.

- calling the KVM callout directly instead of going through the notifier
list would be more acceptable, so that we don't assume anything from the
non-KVM hooks (whether they exist or not), albeit we may assume that we
have complete information about which KVM callout has to be run for a
particular kernel version.

But again, rock-solid KVM+Xenomai is a highly desirable feature.

> 
> Jan
> 
> [1]git://git.kernel.org/pub/scm/virt/kvm/kvm.git
> [2]git://git.kernel.org/pub/scm/virt/kvm/kvm-kmod.git
> [3]http://git.kiszka.org/?p=ipipe-2.6.git;a=commitdiff;h=0bddff1716aba6dd5ca11627ee377a5a25fa3dae
> 


-- 
Philippe.





* Re: [Xenomai-core] [RFC] KVM over Xenomai and I-pipe
  2010-04-23 11:04 ` Philippe Gerum
@ 2010-04-23 12:15   ` Jan Kiszka
  2010-04-23 14:18     ` Philippe Gerum
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kiszka @ 2010-04-23 12:15 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main, xenomai-core

[ dropping xenomai-help before going into details ]

Philippe Gerum wrote:
> On Fri, 2010-03-12 at 16:25 +0100, Jan Kiszka wrote:
>> Hi,
>>
>> this is still in the state "study", but it is working fairly nicely so far:
>>
>> These two patches harden latest KVM for use over I-pipe kernels and make
>> Xenomai aware of the lazy host state restoring that KVM uses for
>> performance reasons. The latter basically means calling the sched-out
>> notifier that KVM registers with the kernel when switching from a Linux
>> task to some shadow. This is safe in all recent versions of KVM and
>> still gives nice KVM performance (that of KVM before 2.6.32) without
>> significant impact on the RT latency (Note: if you have an old VT-x CPU,
>> guest-issued wbinvd will ruin RT as it is not intercepted by the hardware!).
>>
>> To test it, you need to apply the kernel patch on top of current kvm.git
>> master [1], obtain kvm-kmod.git [2], run configure on it (assuming your
>> host kernel is a Xenomai one, otherwise use --kerneldir) and then "make
>> sync-kmod LINUX=/path/to/kvm.git". After a final make && make install,
>> you will have recent kvm modules that are I-pipe aware. The Xenomai
>> patch simply applies to the 2.5 tree. This has been tested with
>> ipipe-2.6.32-x86-2.6-01 + [3] and Xenomai-2.5 git.
>>
>> Feedback welcome, specifically if you think it's worth integrating both
>> patches into upstream. The kernel bits would make sense over some
>> 2.6.33-x86, but additional work will be required to account for the
>> user-return notifiers introduced with that release (kvm-kmod currently
>> wraps them away for older kernels).
> 
> No concern on the final goal, running a Xenomai-enabled kernel
> rock-solid over KVM is a must.
> 
> The KVM code ironing from the 1st patch looks fine to me, no big deal to
> maintain AFAICS. I would be only concerned by the 2nd patch,
> specifically how the KVM callout is invoked from the Xenomai context
> switching code:
> 
> - depending on CONFIG_PREEMPT_NOTIFIERS is much broader than required; I
> guess that CONFIG_KVM would be enough.

So far, only CONFIG_KVM enables CONFIG_PREEMPT_NOTIFIERS. Granted, this
could change in the future. But letting our invocation depend on
CONFIG_KVM would not automatically remove the need to review those new
notifiers (BTW, there would be a fairly high probability that those will
be of some use for Xenomai as well).

> 
> - calling the KVM callout directly instead of going through the notifier
> list would be more acceptable, so that we don't assume anything from the
> non-KVM hooks (whether they exist or not), albeit we may assume that we
> have complete information about which KVM callout has to be run for a
> particular kernel version.

Possible, but hacky. We would have to

- export the callback from the KVM module
  (this will also mean the nucleus will depend on CONFIG_KVM if the
  latter is on)
- somehow get hold of the notifier entry (I have no clue how as they are
  per-vcpu)
- invoke the callback directly, passing that notifier entry

or

- identify the KVM callback in the notifier chain and only call that one
  when walking the list

The latter could be achieved by somehow tagging KVM notifiers in order
to find them when walking the chain. Still quite some patching, and I'm
not yet sure it's worth the safety gain.
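
As a rough sketch of that second variant, the walk could skip every
entry that does not carry a known tag. The userspace model below uses
the address of the ops table as the tag, which is purely an assumption
for illustration (with KVM built as a module, that address would not be
directly visible to the nucleus); all model_* names are made up.

```c
#include <assert.h>
#include <stddef.h>

struct model_task;
struct model_notifier;

struct model_ops {
	void (*sched_out)(struct model_notifier *n, struct model_task *next);
};

struct model_notifier {
	const struct model_ops *ops;
	struct model_notifier *next;
	int calls;                      /* how often sched_out ran */
};

struct model_task {
	struct model_notifier *notifiers;
};

static void model_count_call(struct model_notifier *n, struct model_task *next)
{
	(void)next;
	n->calls++;
}

/* Two distinct ops tables: one plays KVM, the other some unknown user. */
static const struct model_ops model_kvm_ops = { .sched_out = model_count_call };
static const struct model_ops model_other_ops = { .sched_out = model_count_call };

/* Walk prev's list, but only fire hooks whose ops pointer matches the tag. */
static int model_fire_tagged(struct model_task *prev, struct model_task *next,
			     const struct model_ops *tag)
{
	struct model_notifier *n;
	int fired = 0;

	assert(prev != NULL && tag != NULL);
	for (n = prev->notifiers; n != NULL; n = n->next) {
		if (n->ops != tag)
			continue;       /* unreviewed hook: skip it */
		n->ops->sched_out(n, next);
		fired++;
	}
	return fired;
}

/* Encodes the outcome as fired * 100 + kvm.calls * 10 + other.calls. */
static int model_demo_tagged_walk(void)
{
	struct model_notifier other = { .ops = &model_other_ops, .next = NULL, .calls = 0 };
	struct model_notifier kvm = { .ops = &model_kvm_ops, .next = &other, .calls = 0 };
	struct model_task prev = { .notifiers = &kvm };
	struct model_task next = { .notifiers = NULL };
	int fired = model_fire_tagged(&prev, &next, &model_kvm_ops);

	return fired * 100 + kvm.calls * 10 + other.calls;
}
```

In this model only the tagged entry runs, so a hook added later by some
other subsystem would be skipped until someone has reviewed it.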

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



* Re: [Xenomai-core] [RFC] KVM over Xenomai and I-pipe
  2010-04-23 12:15   ` Jan Kiszka
@ 2010-04-23 14:18     ` Philippe Gerum
  2010-04-23 14:30       ` [Xenomai-core] [Adeos-main] " Philippe Gerum
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2010-04-23 14:18 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main, xenomai-core

On Fri, 2010-04-23 at 14:15 +0200, Jan Kiszka wrote:
> [ dropping xenomai-help before going into details ]
> 
> Philippe Gerum wrote:
> > On Fri, 2010-03-12 at 16:25 +0100, Jan Kiszka wrote:
> >> Hi,
> >>
> >> this is still in the state "study", but it is working fairly nicely so far:
> >>
> >> These two patches harden latest KVM for use over I-pipe kernels and make
> >> Xenomai aware of the lazy host state restoring that KVM uses for
> >> performance reasons. The latter basically means calling the sched-out
> >> notifier that KVM registers with the kernel when switching from a Linux
> >> task to some shadow. This is safe in all recent versions of KVM and
> >> still gives nice KVM performance (that of KVM before 2.6.32) without
> >> significant impact on the RT latency (Note: if you have an old VT-x CPU,
> >> guest-issued wbinvd will ruin RT as it is not intercepted by the hardware!).
> >>
> >> To test it, you need to apply the kernel patch on top of current kvm.git
> >> master [1], obtain kvm-kmod.git [2], run configure on it (assuming your
> >> host kernel is a Xenomai one, otherwise use --kerneldir) and then "make
> >> sync-kmod LINUX=/path/to/kvm.git". After a final make && make install,
> >> you will have recent kvm modules that are I-pipe aware. The Xenomai
> >> patch simply applies to the 2.5 tree. This has been tested with
> >> ipipe-2.6.32-x86-2.6-01 + [3] and Xenomai-2.5 git.
> >>
> >> Feedback welcome, specifically if you think it's worth integrating both
> >> patches into upstream. The kernel bits would make sense over some
> >> 2.6.33-x86, but additional work will be required to account for the
> >> user-return notifiers introduced with that release (kvm-kmod currently
> >> wraps them away for older kernels).
> > 
> > No concern on the final goal, running a Xenomai-enabled kernel
> > rock-solid over KVM is a must.
> > 
> > The KVM code ironing from the 1st patch looks fine to me, no big deal to
> > maintain AFAICS. I would be only concerned by the 2nd patch,
> > specifically how the KVM callout is invoked from the Xenomai context
> > switching code:
> > 
> > - depending on CONFIG_PREEMPT_NOTIFIERS is much broader than required; I
> > guess that CONFIG_KVM would be enough.
> 
> So far, only CONFIG_KVM enables CONFIG_PREEMPT_NOTIFIERS. Granted, this
> could change in the future. But letting our invocation depend on
> CONFIG_KVM would not automatically remove the need to review those new
> notifiers (BTW, there would be a fairly high probability that those will
> be of some use for Xenomai as well).
> 
> > 
> > - calling the KVM callout directly instead of going through the notifier
> > list would be more acceptable, so that we don't assume anything from the
> > non-KVM hooks (whether they exist or not), albeit we may assume that we
> > have complete information about which KVM callout has to be run for a
> > particular kernel version.
> 
> Possible, but hacky. We would have to
> 
> - export the callback from the KVM module
>   (this will also mean the nucleus will depend on CONFIG_KVM if the
>   latter is on)

Which is already the case for a number of knobs anyway (particularly on
x86*).

> - somehow get hold of the notifier entry (I have no clue how as they are
>   per-vcpu)
> - invoke the callback directly, passing that notifier entry
> 

This is what I had in mind in my post.

> or
> 
> - identify the KVM callback in the notifier chain and only call that one
>   when walking the list

I don't see any upside to this yet. If this is about context preparation
that would be done by the notification system, then we'd be better off
mimicking it, instead of introducing kludges to reuse it.

> 
> The latter could be achieved by somehow tagging KVM notifiers in order
> to find them when walking the chain. Still quite some patching, and I'm
> not yet sure it's worth the safety gain.

The point is that we shall check whether our coupling to the KVM system
is correct, for each kernel version we want to support anyway. This
means that some preparation work has to be done; whether that is by
inspecting the possibly NMI-unsafe notifier hooks or the interface rules
to the KVM hook is not the most important thing here.

If your definition of "hacky" here means "ad hoc", in any case, any
implementation you could find would be hacky, because Xenomai introduces
a context switching spot in a kernel that does not expect it, and as
such, we do bypass the normal paths for this. Therefore, I see no way to
do this without exactly knowing the kernel/KVM context, on a per-release
basis.

> 
> Jan
> 


-- 
Philippe.





* Re: [Xenomai-core] [Adeos-main]  [RFC] KVM over Xenomai and I-pipe
  2010-04-23 14:18     ` Philippe Gerum
@ 2010-04-23 14:30       ` Philippe Gerum
  2010-04-23 18:22         ` Jan Kiszka
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2010-04-23 14:30 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main, xenomai-core

On Fri, 2010-04-23 at 16:18 +0200, Philippe Gerum wrote:
> On Fri, 2010-04-23 at 14:15 +0200, Jan Kiszka wrote:
> > [ dropping xenomai-help before going into details ]
> > 
> > Philippe Gerum wrote:
> > > On Fri, 2010-03-12 at 16:25 +0100, Jan Kiszka wrote:
> > >> Hi,
> > >>
> > >> this is still in the state "study", but it is working fairly nicely so far:
> > >>
> > >> These two patches harden latest KVM for use over I-pipe kernels and make
> > >> Xenomai aware of the lazy host state restoring that KVM uses for
> > >> performance reasons. The latter basically means calling the sched-out
> > >> notifier that KVM registers with the kernel when switching from a Linux
> > >> task to some shadow. This is safe in all recent versions of KVM and
> > >> still gives nice KVM performance (that of KVM before 2.6.32) without
> > >> significant impact on the RT latency (Note: if you have an old VT-x CPU,
> > >> guest-issued wbinvd will ruin RT as it is not intercepted by the hardware!).
> > >>
> > >> To test it, you need to apply the kernel patch on top of current kvm.git
> > >> master [1], obtain kvm-kmod.git [2], run configure on it (assuming your
> > >> host kernel is a Xenomai one, otherwise use --kerneldir) and then "make
> > >> sync-kmod LINUX=/path/to/kvm.git". After a final make && make install,
> > >> you will have recent kvm modules that are I-pipe aware. The Xenomai
> > >> patch simply applies to the 2.5 tree. This has been tested with
> > >> ipipe-2.6.32-x86-2.6-01 + [3] and Xenomai-2.5 git.
> > >>
> > >> Feedback welcome, specifically if you think it's worth integrating both
> > >> patches into upstream. The kernel bits would make sense over some
> > >> 2.6.33-x86, but additional work will be required to account for the
> > >> user-return notifiers introduced with that release (kvm-kmod currently
> > >> wraps them away for older kernels).
> > > 
> > > No concern on the final goal, running a Xenomai-enabled kernel
> > > rock-solid over KVM is a must.
> > > 
> > > The KVM code ironing from the 1st patch looks fine to me, no big deal to
> > > maintain AFAICS. I would be only concerned by the 2nd patch,
> > > specifically how the KVM callout is invoked from the Xenomai context
> > > switching code:
> > > 
> > > - depending on CONFIG_PREEMPT_NOTIFIERS is much broader than required; I
> > > guess that CONFIG_KVM would be enough.
> > 
> > So far, only CONFIG_KVM enables CONFIG_PREEMPT_NOTIFIERS. Granted, this
> > could change in the future. But letting our invocation depend on
> > CONFIG_KVM would not automatically remove the need to review those new
> > notifiers (BTW, there would be a fairly high probability that those will
> > be of some use for Xenomai as well).
> > 
> > > 
> > > - calling the KVM callout directly instead of going through the notifier
> > > list would be more acceptable, so that we don't assume anything from the
> > > non-KVM hooks (whether they exist or not), albeit we may assume that we
> > > have complete information about which KVM callout has to be run for a
> > > particular kernel version.
> > 
> > Possible, but hacky. We would have to
> > 
> > - export the callback from the KVM module
> >   (this will also mean the nucleus will depend on CONFIG_KVM if the
> >   latter is on)
> 
> Which is already the case for a number of knobs anyway (particularly on
> x86*).
> 
> > - somehow get hold of the notifier entry (I have no clue how as they are
> >   per-vcpu)
> > - invoke the callback directly, passing that notifier entry
> > 
> 
> This is what I had in mind in my post.

Sorry, wrong read: what I had in mind, was simply to identify the KVM
hook within the code, and forge a correct call interface, whatever this
means (i.e. with the original notifier entry, or by providing a second
hook entry point which would not require such notifier entry).

> 
> > or
> > 
> > - identify the KVM callback in the notifier chain and only call that one
> >   when walking the list
> 
> I don't see any upside to this yet. If this is about context preparation
> that would be done by the notification system, then we'd be better off
> mimicking it, instead of introducing kludges to reuse it.
> 
> > 
> > The latter could be achieved by somehow tagging KVM notifiers in order
> > to find them when walking the chain. Still quite some patching, and I'm
> > not yet sure it's worth the safety gain.
> 
> The point is that we shall check whether our coupling to the KVM system
> is correct, for each kernel version we want to support anyway. This
> means that some preparation work has to be done, whether it is by
> inspecting the possibly NMI-unsafe notifier hooks or the interface rules
> to the KVM hook is not the most important thing here.
> 
> If your definition of "hacky" here means "ad hoc", in any case, any
> implementation you could find would be hacky, because Xenomai introduces
> a context switching spot in a kernel that does not expect it, and as
> such, we do bypass the normal paths for this. Therefore, I see no way to
> do this without exactly knowing the kernel/KVM context, on a per-release
> basis.
> 
> > 
> > Jan
> > 
> 
> 


-- 
Philippe.





* Re: [Xenomai-core] [Adeos-main]  [RFC] KVM over Xenomai and I-pipe
  2010-04-23 14:30       ` [Xenomai-core] [Adeos-main] " Philippe Gerum
@ 2010-04-23 18:22         ` Jan Kiszka
  2010-04-23 21:38           ` Philippe Gerum
  0 siblings, 1 reply; 8+ messages in thread
From: Jan Kiszka @ 2010-04-23 18:22 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main, xenomai-core

[-- Attachment #1: Type: text/plain, Size: 6211 bytes --]

Philippe Gerum wrote:
> On Fri, 2010-04-23 at 16:18 +0200, Philippe Gerum wrote:
>> On Fri, 2010-04-23 at 14:15 +0200, Jan Kiszka wrote:
>>> [ dropping xenomai-help before going into details ]
>>>
>>> Philippe Gerum wrote:
>>>> On Fri, 2010-03-12 at 16:25 +0100, Jan Kiszka wrote:
>>>>> Hi,
>>>>>
>>>>> this is still in the state "study", but it is working fairly nicely so far:
>>>>>
>>>>> These two patches harden latest KVM for use over I-pipe kernels and make
>>>>> Xenomai aware of the lazy host state restoring that KVM uses for
>>>>> performance reasons. The latter basically means calling the sched-out
>>>>> notifier that KVM registers with the kernel when switching from a Linux
>>>>> task to some shadow. This is safe in all recent versions of KVM and
>>>>> still gives nice KVM performance (that of KVM before 2.6.32) without
>>>>> significant impact on the RT latency (Note: if you have an old VT-x CPU,
>>>>> guest-issued wbinvd will ruin RT as it is not intercepted by the hardware!).
>>>>>
>>>>> To test it, you need to apply the kernel patch on top of current kvm.git
>>>>> master [1], obtain kvm-kmod.git [2], run configure on it (assuming your
>>>>> host kernel is a Xenomai one, otherwise use --kerneldir) and then "make
>>>>> sync-kmod LINUX=/path/to/kvm.git". After a final make && make install,
>>>>> you will have recent kvm modules that are I-pipe aware. The Xenomai
>>>>> patch simply applies to the 2.5 tree. This has been tested with
>>>>> ipipe-2.6.32-x86-2.6-01 + [3] and Xenomai-2.5 git.
>>>>>
>>>>> Feedback welcome, specifically if you think it's worth integrating both
>>>>> patches into upstream. The kernel bits would make sense over some
>>>>> 2.6.33-x86, but additional work will be required to account for the
>>>>> user-return notifiers introduced with that release (kvm-kmod currently
>>>>> wraps them away for older kernels).
>>>> No concern on the final goal, running a Xenomai-enabled kernel
>>>> rock-solid over KVM is a must.
>>>>
>>>> The KVM code ironing from the 1st patch looks fine to me, no big deal to
>>>> maintain AFAICS. I would be only concerned by the 2nd patch,
>>>> specifically how the KVM callout is invoked from the Xenomai context
>>>> switching code:
>>>>
>>>> - depending on CONFIG_PREEMPT_NOTIFIERS is much broader than required; I
>>>> guess that CONFIG_KVM would be enough.
>>> So far, only CONFIG_KVM enables CONFIG_PREEMPT_NOTIFIERS. Granted, this
>>> could change in the future. But letting our invocation depend on
>>> CONFIG_KVM would not automatically remove the need to review those new
>>> notifiers (BTW, there would be a fairly high probability that those will
>>> be of some use for Xenomai as well).
>>>
>>>> - calling the KVM callout directly instead of going through the notifier
>>>> list would be more acceptable, so that we don't assume anything from the
>>>> non-KVM hooks (whether they exist or not), albeit we may assume that we
>>>> have complete information about which KVM callout has to be run for a
>>>> particular kernel version.
>>> Possible, but hacky. We would have to
>>>
>>> - export the callback from the KVM module
>>>   (this will also mean the nucleus will depend on CONFIG_KVM if the
>>>   latter is on)
>> Which is already the case for a number of knobs anyway (particularly on
>> x86*).

The difference is that kvm can be configured as a _module_. Simply
exporting won't be enough.

>>
>>> - somehow get hold of the notifier entry (I have no clue how as they are
>>>   per-vcpu)
>>> - invoke the callback directly, passing that notifier entry
>>>
>> This is what I had in mind in my post.
> 
> Sorry, wrong read: what I had in mind, was simply to identify the KVM
> hook within the code, and forge a correct call interface, whatever this
> means (i.e. with the original notifier entry, or by providing a second
> hook entry point which would not require such notifier entry).

As KVM registers dynamically with the notifier chain (when the
corresponding VCPU is scheduled in and out), getting the right context is
tricky unless you reuse the notifier chain or let I-pipe provide another
callback interface.

> 
>>> or
>>>
>>> - identify the KVM callback in the notifier chain and only call that one
>>>   when walking the list
>> I don't see any upside to this yet. If this is about context preparation
>> that would be done by the notification system, then we'd be better off
>> mimicking it, instead of introducing kludges to reuse it.

Mimicking will mean (almost) 1:1 copying.

>>
>>> The latter could be achieved by somehow tagging KVM notifiers in order
>>> to find them when walking the chain. Still quite some patching, and I'm
>>> not yet sure it's worth the safety gain.
>> The point is that we shall check whether our coupling to the KVM system
>> is correct, for each kernel version we want to support anyway. This
>> means that some preparation work has to be done, whether it is by
>> inspecting the possibly NMI-unsafe notifier hooks or the interface rules
>> to the KVM hook is not the most important thing here.
>>
>> If your definition of "hacky" here means "ad hoc", in any case, any
>> implementation you could find would be hacky, because Xenomai introduces
>> a context switching spot in a kernel that does not expect it, and as
>> such, we do bypass the normal paths for this. Therefore, I see no way to
>> do this without exactly knowing the kernel/KVM context, on a per-release
>> basis.

Right, that's what we already have to know in order to reuse e.g.
switch_mm safely. The preempt notifiers play in the same league, as they
are there to inform subsystems about this kind of switch.

So we have two basic options:
 - patch KVM to additionally register callbacks with I-pipe
   (ipipe_preempt_notifiers)
 - reuse the existing sched_out notifier, keeping an eye on potential
   new users (they exist since 2.6.23 - without anyone else showing
   interest so far)

In both cases, we will pull some tricky parts of KVM on our review list,
that's unavoidable. But as long as we reuse well-established interfaces
for this, I'm not too concerned about this.
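
To make the first option a bit more concrete, here is a userspace sketch
of a separate, I-pipe-owned registration point. Nothing like this exists
yet; the per-CPU slot, the function names and the register/unregister
timing are all assumptions made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NR_CPUS 2

/* One hook slot per CPU, owned by the (modelled) pipeline, not by Linux. */
struct model_hook {
	void (*sched_out)(void *ctx);
	void *ctx;
};

static struct model_hook model_hooks[MODEL_NR_CPUS];

/* Would be called by KVM when a VCPU is scheduled in on this CPU. */
static void model_hook_register(int cpu, void (*fn)(void *ctx), void *ctx)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS && fn != NULL);
	model_hooks[cpu].sched_out = fn;
	model_hooks[cpu].ctx = ctx;
}

/* Would be called by KVM when the VCPU is scheduled out again. */
static void model_hook_unregister(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	model_hooks[cpu].sched_out = NULL;
	model_hooks[cpu].ctx = NULL;
}

/* Would be called on a switch to another domain; returns 1 if a hook ran. */
static int model_domain_switch(int cpu)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	if (model_hooks[cpu].sched_out == NULL)
		return 0;
	model_hooks[cpu].sched_out(model_hooks[cpu].ctx);
	return 1;
}

struct model_vcpu {
	int host_restores;   /* how often host state was restored */
};

static void model_vcpu_sched_out(void *ctx)
{
	struct model_vcpu *vcpu = ctx;

	vcpu->host_restores++;
}

/* Encodes the outcome as hooks_run * 10 + host_restores. */
static int model_demo_private_hook(void)
{
	struct model_vcpu vcpu = { .host_restores = 0 };
	int ran = 0;

	model_hook_register(0, model_vcpu_sched_out, &vcpu);
	ran += model_domain_switch(0);   /* hook registered on CPU 0 */
	ran += model_domain_switch(1);   /* nothing registered on CPU 1 */
	model_hook_unregister(0);
	ran += model_domain_switch(0);   /* hook already removed */
	return ran * 10 + vcpu.host_restores;
}
```

The trade-off against reusing sched_out is visible even in this model:
the private slot never runs foreign hooks, but KVM has to be patched to
fill and clear it at the right moments.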

Jan




* Re: [Xenomai-core] [Adeos-main]  [RFC] KVM over Xenomai and I-pipe
  2010-04-23 18:22         ` Jan Kiszka
@ 2010-04-23 21:38           ` Philippe Gerum
  2010-04-26 16:23             ` Jan Kiszka
  0 siblings, 1 reply; 8+ messages in thread
From: Philippe Gerum @ 2010-04-23 21:38 UTC (permalink / raw)
  To: Jan Kiszka; +Cc: adeos-main, xenomai-core

On Fri, 2010-04-23 at 20:22 +0200, Jan Kiszka wrote:
> Philippe Gerum wrote:
> > On Fri, 2010-04-23 at 16:18 +0200, Philippe Gerum wrote:
> >> On Fri, 2010-04-23 at 14:15 +0200, Jan Kiszka wrote:
> >>> [ dropping xenomai-help before going into details ]
> >>>
> >>> Philippe Gerum wrote:
> >>>> On Fri, 2010-03-12 at 16:25 +0100, Jan Kiszka wrote:
> >>>>> Hi,
> >>>>>
> >>>>> this is still in the state "study", but it is working fairly nicely so far:
> >>>>>
> >>>>> These two patches harden latest KVM for use over I-pipe kernels and make
> >>>>> Xenomai aware of the lazy host state restoring that KVM uses for
> >>>>> performance reasons. The latter basically means calling the sched-out
> >>>>> notifier that KVM registers with the kernel when switching from a Linux
> >>>>> task to some shadow. This is safe in all recent versions of KVM and
> >>>>> still gives nice KVM performance (that of KVM before 2.6.32) without
> >>>>> significant impact on the RT latency (Note: if you have an old VT-x CPU,
> >>>>> guest-issued wbinvd will ruin RT as it is not intercepted by the hardware!).
> >>>>>
> >>>>> To test it, you need to apply the kernel patch on top of current kvm.git
> >>>>> master [1], obtain kvm-kmod.git [2], run configure on it (assuming your
> >>>>> host kernel is a Xenomai one, otherwise use --kerneldir) and then "make
> >>>>> sync-kmod LINUX=/path/to/kvm.git". After a final make && make install,
> >>>>> you will have recent kvm modules that are I-pipe aware. The Xenomai
> >>>>> patch simply applies to the 2.5 tree. This has been tested with
> >>>>> ipipe-2.6.32-x86-2.6-01 + [3] and Xenomai-2.5 git.
> >>>>>
> >>>>> Feedback welcome, specifically if you think it's worth integrating both
> >>>>> patches into upstream. The kernel bits would make sense over some
> >>>>> 2.6.33-x86, but additional work will be required to account for the
> >>>>> user-return notifiers introduced with that release (kvm-kmod currently
> >>>>> wraps them away for older kernels).
> >>>> No concern on the final goal, running a Xenomai-enabled kernel
> >>>> rock-solid over KVM is a must.
> >>>>
> >>>> The KVM code ironing from the 1st patch looks fine to me, no big deal to
> >>>> maintain AFAICS. I would be only concerned by the 2nd patch,
> >>>> specifically how the KVM callout is invoked from the Xenomai context
> >>>> switching code:
> >>>>
> >>>> - depending on CONFIG_PREEMPT_NOTIFIERS is much broader than required; I
> >>>> guess that CONFIG_KVM would be enough.
> >>> So far, only CONFIG_KVM enables CONFIG_PREEMPT_NOTIFIERS. Granted, this
> >>> could change in the future. But letting our invocation depend on
> >>> CONFIG_KVM would not automatically remove the need to review those new
> >>> notifiers (BTW, there would be a fairly high probability that those will
> >>> be of some use for Xenomai as well).
> >>>
> >>>> - calling the KVM callout directly instead of going through the notifier
> >>>> list would be more acceptable, so that we don't assume anything from the
> >>>> non-KVM hooks (whether they exist or not), albeit we may assume that we
> >>>> have complete information about which KVM callout has to be run for a
> >>>> particular kernel version.
> >>> Possible, but hacky. We would have to
> >>>
> >>> - export the callback from the KVM module
> >>>   (this will also mean the nucleus will depend on CONFIG_KVM if the
> >>>   latter is on)
> >> Which is already the case for a number of knobs anyway (particularly on
> >> x86*).
> 
> The difference is that kvm can be configured as _module_. Simply
> exporting won't be enough.
> 

Quite frankly, I see no showstopper in forcing a statically built KVM
whenever Xenomai is enabled, provided we do that only when, say,
CONFIG_XENOMAI_VMCLIENT is switched on. Would you see a significant
feature loss in removing modular support for KVM in this context?

> >>
> >>> - somehow get hold of the notifier entry (I have no clue how as they are
> >>>   per-vcpu)
> >>> - invoke the callback directly, passing that notifier entry
> >>>
> >> This is what I had in mind in my post.
> > 
> > Sorry, wrong read: what I had in mind, was simply to identify the KVM
> > hook within the code, and forge a correct call interface, whatever this
> > means (i.e. with the original notifier entry, or by providing a second
> > hook entry point which would not require such notifier entry).
> 
> As KVM registers dynamically with the notifier chain (when the
> corresponding VCPU is scheduled in and out), getting the right context is
> tricky unless you reuse the notifier chain or let I-pipe provide another
> callback interface.
> 
> > 
> >>> or
> >>>
> >>> - identify the KVM callback in the notifier chain and only call that one
> >>>   when walking the list
> >> I don't see any upside to this yet. If this is about context preparation
> >> that would be done by the notification system, then we'd be better off
> >> mimicking it, instead of introducing kludges to reuse it.
> 
> Mimicking will mean (almost) 1:1 copying.

Yes, for sure. That's the price to pay, I guess, like for anything we
must reuse from the innards.

> 
> >>
> >>> The latter could be achieved by somehow tagging KVM notifiers in order
> >>> to find them when walking the chain. Still quite some patching, and I'm
> >>> not yet sure it's worth the safety gain.
> >> The point is that we shall check whether our coupling to the KVM system
> >> is correct, for each kernel version we want to support anyway. This
> >> means that some preparation work has to be done; whether that is by
> >> inspecting the possibly NMI-unsafe notifier hooks or the interface rules
> >> to the KVM hook is not the most important thing here.
> >>
> >> If your definition of "hacky" here means "ad hoc", in any case, any
> >> implementation you could find would be hacky, because Xenomai introduces
> >> a context switching spot in a kernel that does not expect it, and as
> >> such, we do bypass the normal paths for this. Therefore, I see no way to
> >> do this without exactly knowing the kernel/KVM context, on a per-release
> >> basis.
> 
> Right, that's what we already have to know in order to reuse e.g.
> switch_mm safely. The preempt notifiers play in the same league, as they
> are there to inform subsystems about this kind of switch.
> 
> So we have two basic options:
>  - patch KVM to additionally register callbacks with I-pipe
>    (ipipe_preempt_notifiers)
>  - reuse the existing sched_out notifier, keeping an eye on potential
>    new users (they exist since 2.6.23 - without anyone else showing
>    interest so far)
> 
> In both cases, we will pull some tricky parts of KVM onto our review list,
> that's unavoidable. But as long as we reuse well-established interfaces
> for this, I'm not too concerned about this.
> 

Ack.

> Jan
> 


-- 
Philippe.





* Re: [Xenomai-core] [Adeos-main]  [RFC] KVM over Xenomai and I-pipe
  2010-04-23 21:38           ` Philippe Gerum
@ 2010-04-26 16:23             ` Jan Kiszka
  0 siblings, 0 replies; 8+ messages in thread
From: Jan Kiszka @ 2010-04-26 16:23 UTC (permalink / raw)
  To: Philippe Gerum; +Cc: adeos-main, xenomai-core

Philippe Gerum wrote:
> On Fri, 2010-04-23 at 20:22 +0200, Jan Kiszka wrote:
>> Philippe Gerum wrote:
>>> On Fri, 2010-04-23 at 16:18 +0200, Philippe Gerum wrote:
>>>> On Fri, 2010-04-23 at 14:15 +0200, Jan Kiszka wrote:
>>>>> [ dropping xenomai-help before going into details ]
>>>>>
>>>>> Philippe Gerum wrote:
>>>>>> On Fri, 2010-03-12 at 16:25 +0100, Jan Kiszka wrote:
>>>>>>> Hi,
>>>>>>>
>>>>>>> this is still in the state "study", but it is working fairly nicely so far:
>>>>>>>
>>>>>>> These two patches harden latest KVM for use over I-pipe kernels and make
>>>>>>> Xenomai aware of the lazy host state restoring that KVM uses for
>>>>>>> performance reasons. The latter basically means calling the sched-out
>>>>>>> notifier that KVM registers with the kernel when switching from a Linux
>>>>>>> task to some shadow. This is safe in all recent versions of KVM and
>>>>>>> still gives nice KVM performance (that of KVM before 2.6.32) without
>>>>>>> significant impact on the RT latency (Note: if you have an old VT-x CPU,
>>>>>>> guest-issued wbinvd will ruin RT as it is not intercepted by the hardware!).
>>>>>>>
>>>>>>> To test it, you need to apply the kernel patch on top of current kvm.git
>>>>>>> master [1], obtain kvm-kmod.git [2], run configure on it (assuming your
>>>>>>> host kernel is a Xenomai one, otherwise use --kerneldir) and then "make
>>>>>>> sync-kmod LINUX=/path/to/kvm.git". After a final make && make install,
>>>>>>> you will have recent kvm modules that are I-pipe aware. The Xenomai
>>>>>>> patch simply applies to the 2.5 tree. This has been tested with
>>>>>>> ipipe-2.6.32-x86-2.6-01 + [3] and Xenomai-2.5 git.
>>>>>>>
>>>>>>> Feedback welcome, specifically if you think it's worth integrating both
>>>>>>> patches into upstream. The kernel bits would make sense over some
>>>>>>> 2.6.33-x86, but additional work will be required to account for the
>>>>>>> user-return notifiers introduced with that release (kvm-kmod currently
>>>>>>> wraps them away for older kernels).
>>>>>> No concern on the final goal, running a Xenomai-enabled kernel
>>>>>> rock-solid over KVM is a must.
>>>>>>
>>>>>> The KVM code ironing from the 1st patch looks fine to me, no big deal to
>>>>>> maintain AFAICS. I would be only concerned by the 2nd patch,
>>>>>> specifically how the KVM callout is invoked from the Xenomai context
>>>>>> switching code:
>>>>>>
>>>>>> - depending on CONFIG_PREEMPT_NOTIFIERS is much broader than required; I
>>>>>> guess that CONFIG_KVM would be enough.
>>>>> So far, only CONFIG_KVM enables CONFIG_PREEMPT_NOTIFIERS. Granted, this
>>>>> could change in the future. But letting our invocation depend on
>>>>> CONFIG_KVM would not automatically remove the need to review those new
>>>>> notifiers (BTW, there would be a fairly high probability that those will
>>>>> be of some use for Xenomai as well).
>>>>>
>>>>>> - calling the KVM callout directly instead of going through the notifier
>>>>>> list would be more acceptable, so that we don't assume anything from the
>>>>>> non-KVM hooks (whether they exist or not), albeit we may assume that we
>>>>>> have complete information about which KVM callout has to be run for a
>>>>>> particular kernel version.
>>>>> Possible, but hacky. We would have to
>>>>>
>>>>> - export the callback from the KVM module
>>>>>   (this will also mean the nucleus will depend on CONFIG_KVM if the
>>>>>   latter is on)
>>>> Which is already the case for a number of knobs anyway (particularly on
>>>> x86*).
>> The difference is that kvm can be configured as _module_. Simply
>> exporting won't be enough.
>>
> 
> Quite frankly, I see no showstopper in forcing a statically built KVM
> whenever Xenomai is enabled, provided we do that only when, say,
> CONFIG_XENOMAI_VMCLIENT is switched on. Would you see a significant
> feature loss in removing modular support for KVM in this context?

I finally realized that, due to the dynamic nature of KVM's
registration, this would not work at all. We need a callback that KVM
actively registers (along with the right cookie).

Moreover, it makes sense to do the deregistration atomically along with
voluntary sched_out calls of KVM. So establishing some tiny
ipipe_preempt_notifier has its benefits (or is even mandatory). Will
hack up something like that for 2.6.34 and come back once it works.
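
Roughly, the idea could look like the following user-space sketch. It
is only an assumption of the shape such a notifier might take, not the
planned I-pipe interface: a single slot (per-CPU in a real kernel, and
only touched with hard interrupts off) holding a handler plus cookie
that KVM registers when it loads a vcpu and drops again on a voluntary
sched_out. All demo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical single-slot notifier; per-CPU in a real kernel. */
struct demo_ipipe_notifier {
    void (*handler)(void *cookie);
    void *cookie;
};

static struct demo_ipipe_notifier demo_slot;

static void demo_register(void (*handler)(void *), void *cookie)
{
    demo_slot.cookie = cookie;
    demo_slot.handler = handler;
}

static void demo_unregister(void)
{
    demo_slot.handler = NULL;
    demo_slot.cookie = NULL;
}

/*
 * Called on the RT switch path: fire the registered handler at most
 * once, clearing the slot first. Returns 1 if a handler was called.
 */
static int demo_notify_preempt(void)
{
    void (*handler)(void *) = demo_slot.handler;
    void *cookie = demo_slot.cookie;

    if (!handler)
        return 0;
    demo_unregister();
    handler(cookie);
    return 1;
}

/* Hypothetical stand-in for a vcpu with lazily restored host state. */
struct demo_lazy_vcpu {
    int host_state_loaded;          /* 0 while guest state is live */
};

static void demo_restore_host_state(void *cookie)
{
    struct demo_lazy_vcpu *vcpu = cookie;

    vcpu->host_state_loaded = 1;
}

/* Loading a vcpu: guest state goes live and the hook is registered. */
static void demo_vcpu_load(struct demo_lazy_vcpu *vcpu)
{
    vcpu->host_state_loaded = 0;
    demo_register(demo_restore_host_state, vcpu);
}

/* Voluntary sched_out: restore state and drop the hook together. */
static void demo_vcpu_put(struct demo_lazy_vcpu *vcpu)
{
    demo_unregister();
    vcpu->host_state_loaded = 1;
}
```

The cookie is what a static, exported callback could not provide: the
handler learns which vcpu it is dealing with only because KVM passed
it along at registration time.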

Jan

-- 
Siemens AG, Corporate Technology, CT T DE IT 1
Corporate Competence Center Embedded Linux



end of thread, other threads:[~2010-04-26 16:23 UTC | newest]

Thread overview: 8+ messages
2010-03-12 15:25 [Xenomai-core] [RFC] KVM over Xenomai and I-pipe Jan Kiszka
2010-04-23 11:04 ` Philippe Gerum
2010-04-23 12:15   ` Jan Kiszka
2010-04-23 14:18     ` Philippe Gerum
2010-04-23 14:30       ` [Xenomai-core] [Adeos-main] " Philippe Gerum
2010-04-23 18:22         ` Jan Kiszka
2010-04-23 21:38           ` Philippe Gerum
2010-04-26 16:23             ` Jan Kiszka
