linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] kvm: log directly from the guest to the host kvm buffer
@ 2011-05-12 13:36 Dhaval Giani
  2011-05-12 15:13 ` Avi Kivity
  0 siblings, 1 reply; 6+ messages in thread
From: Dhaval Giani @ 2011-05-12 13:36 UTC (permalink / raw)
  To: kvm, joro, agraf, rostedt; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 5004 bytes --]

Hi,

As part of some of the work for my project, I have been looking at
tracing some of the events in the guest from inside the host. In
my usecase, I have been looking to co-relate the time of a network
packet arrival with that in the host. ftrace makes such arbitrary
use quite simple, so I went ahead an extended this functionality
in terms of a hypercall. There are still a few issues with this patch.

1. For some reason, the first time the hypercall is called, it works
just fine, but the second invocation refuses to happen. I am still
clueless about it. (and am looking for hints :-) )
2. I am not very sure if I got the demarcation between the guest and
the host code fine or not. Someone more experienced than me should take
a look at the code as well :-)
3. This adds a new paravirt call.
4. This has been implemented just for x86 as of now. If there is enough
interest, I will look to make it more generic to be used across other
architectures. However, it is quite easy to do the same.
5. It still does not have all the fancy ftrace features, but again,
depending on the interest, I can add all those in.
6. Create a config option for this feature.

I think such a feature is useful for debugging purposes and might make
sense to carry upstream.

Thanks,
Dhaval

---
kvm: log directly from the guest to the host kvm buffer

Add a new hypercall kvm_pv_ftrace() which logs in the host's
ftrace buffer. To use, the caller should use the kvm_ftrace
macro in the guest. This is still very early code and does not fully
work.

Signed-off-by: Dhaval Giani <dhaval.giani@gmail.com>

---
 arch/x86/include/asm/kvm_para.h |   11 +++++++++++
 arch/x86/kernel/kvm.c           |   22 ++++++++++++++++++++++
 arch/x86/kvm/x86.c              |   18 ++++++++++++++++++
 include/linux/kvm_para.h        |    1 +
 4 files changed, 52 insertions(+)

Index: linux-2.6/arch/x86/kvm/x86.c
===================================================================
--- linux-2.6.orig/arch/x86/kvm/x86.c
+++ linux-2.6/arch/x86/kvm/x86.c
@@ -4832,6 +4832,21 @@ int kvm_hv_hypercall(struct kvm_vcpu *vc
 	return 1;
 }
 
+int kvm_pv_ftrace(struct kvm_vcpu *vcpu, unsigned long ip, gpa_t addr)
+{
+	int ret;
+	char *fmt = (char *) kzalloc(PAGE_SIZE, GFP_KERNEL);
+
+	ret = kvm_read_guest(vcpu->kvm, addr, fmt, PAGE_SIZE);
+
+	trace_printk("KVM instance %p: VCPU %d, IP %lu: %s",
+				vcpu->kvm, vcpu->vcpu_id, ip, fmt);
+
+	kfree(fmt);
+
+	return 0;
+}
+
 int kvm_emulate_hypercall(struct kvm_vcpu *vcpu)
 {
 	unsigned long nr, a0, a1, a2, a3, ret;
@@ -4868,6 +4883,9 @@ int kvm_emulate_hypercall(struct kvm_vcp
 	case KVM_HC_MMU_OP:
 		r = kvm_pv_mmu_op(vcpu, a0, hc_gpa(vcpu, a1, a2), &ret);
 		break;
+	case KVM_HC_FTRACE:
+		ret = kvm_pv_ftrace_printk(vcpu, a0, hc_gpa(vcpu, a1, a2));
+		break;
 	default:
 		ret = -KVM_ENOSYS;
 		break;
Index: linux-2.6/include/linux/kvm_para.h
===================================================================
--- linux-2.6.orig/include/linux/kvm_para.h
+++ linux-2.6/include/linux/kvm_para.h
@@ -19,6 +19,7 @@
 #define KVM_HC_MMU_OP			2
 #define KVM_HC_FEATURES			3
 #define KVM_HC_PPC_MAP_MAGIC_PAGE	4
+#define KVM_HC_FTRACE			5
 
 /*
  * hypercalls use architecture specific
Index: linux-2.6/arch/x86/kernel/kvm.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/kvm.c
+++ linux-2.6/arch/x86/kernel/kvm.c
@@ -274,6 +274,28 @@ static void kvm_mmu_op(void *buffer, uns
 	} while (len);
 }
 
+int kvm_ftrace_printk(unsigned long ip, const char *fmt, ...)
+{
+	char *buffer = kzalloc(PAGE_SIZE, GFP_KERNEL);
+	int ret;
+	unsigned long a1, a2;
+	va_list args;
+	int i;
+
+	va_start(args, fmt);
+	i = vsnprintf(buffer, PAGE_SIZE, fmt, args);
+	va_end(args);
+
+	a1 = __pa(buffer);
+	a2 = 0;
+
+	ret = kvm_hypercall3(KVM_HC_FTRACE, ip, a1, a2);
+
+	kfree(buffer);
+	return ret;
+}
+EXPORT_SYMBOL(kvm_ftrace_printk);
+
 static void mmu_queue_flush(struct kvm_para_state *state)
 {
 	if (state->mmu_queue_len) {
Index: linux-2.6/arch/x86/include/asm/kvm_para.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/kvm_para.h
+++ linux-2.6/arch/x86/include/asm/kvm_para.h
@@ -21,6 +21,10 @@
  */
 #define KVM_FEATURE_CLOCKSOURCE2        3
 #define KVM_FEATURE_ASYNC_PF		4
+/*
+ * Right now an experiment, hopefully it works
+ */
+#define KVM_FEATURE_FTRACE		5
 
 /* The last 8 bits are used to indicate how to interpret the flags field
  * in pvclock structure. If no bits are set, all flags are ignored.
@@ -188,6 +192,13 @@ static inline u32 kvm_read_and_reset_pf_
 }
 #endif
 
+int kvm_ftrace_printk(unsigned long ip, const char *fmt, ...);
+
+#define kvm_ftrace(fmt, args...)				\
+	do {							\
+		kvm_ftrace_printk(_THIS_IP_, fmt, ##args);	\
+	} while(0)
+
 #endif /* __KERNEL__ */
 
 #endif /* _ASM_X86_KVM_PARA_H */


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm: log directly from the guest to the host kvm buffer
  2011-05-12 13:36 [PATCH] kvm: log directly from the guest to the host kvm buffer Dhaval Giani
@ 2011-05-12 15:13 ` Avi Kivity
  2011-05-12 15:39   ` Dhaval Giani
  0 siblings, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2011-05-12 15:13 UTC (permalink / raw)
  To: Dhaval Giani; +Cc: kvm, joro, agraf, rostedt, linux-kernel

On 05/12/2011 04:36 PM, Dhaval Giani wrote:
> Hi,
>
> As part of some of the work for my project, I have been looking at
> tracing some of the events in the guest from inside the host. In
> my usecase, I have been looking to co-relate the time of a network
> packet arrival with that in the host. ftrace makes such arbitrary
> use quite simple, so I went ahead an extended this functionality
> in terms of a hypercall. There are still a few issues with this patch.
>
> 1. For some reason, the first time the hypercall is called, it works
> just fine, but the second invocation refuses to happen. I am still
> clueless about it. (and am looking for hints :-) )
> 2. I am not very sure if I got the demarcation between the guest and
> the host code fine or not. Someone more experienced than me should take
> a look at the code as well :-)
> 3. This adds a new paravirt call.
> 4. This has been implemented just for x86 as of now. If there is enough
> interest, I will look to make it more generic to be used across other
> architectures. However, it is quite easy to do the same.
> 5. It still does not have all the fancy ftrace features, but again,
> depending on the interest, I can add all those in.
> 6. Create a config option for this feature.
>
> I think such a feature is useful for debugging purposes and might make
> sense to carry upstream.

I guess it could help things like virtio/vhost development and profiling.


I think that one hypercall per trace is too expensive.  Tracing is meant 
to be lightweight!  I think the guest can log to a buffer, which is 
flushed on overflow or when a vmexit occurs.  That gives us automatic 
serialization between a vcpu and the cpu it runs on, but not between a 
vcpu and a different host cpu.

>
> +int kvm_pv_ftrace(struct kvm_vcpu *vcpu, unsigned long ip, gpa_t addr)
> +{
> +	int ret;
> +	char *fmt = (char *) kzalloc(PAGE_SIZE, GFP_KERNEL);
> +
> +	ret = kvm_read_guest(vcpu->kvm, addr, fmt, PAGE_SIZE);
> +
> +	trace_printk("KVM instance %p: VCPU %d, IP %lu: %s",
> +				vcpu->kvm, vcpu->vcpu_id, ip, fmt);
> +
> +	kfree(fmt);
> +
> +	return 0;
> +}

A kmalloc and printf seem expensive here.  I'd prefer to log the 
arguments and format descriptor instead.  Similarly the guest should 
pass unformatted parameters.+int kvm_ftrace_printk(unsigned long ip, 
const char *fmt, ...)
> +{
> +	char *buffer = kzalloc(PAGE_SIZE, GFP_KERNEL);
> +	int ret;
> +	unsigned long a1, a2;
> +	va_list args;
> +	int i;
> +
> +	va_start(args, fmt);
> +	i = vsnprintf(buffer, PAGE_SIZE, fmt, args);
> +	va_end(args);
> +
> +	a1 = __pa(buffer);
> +	a2 = 0;
> +
> +	ret = kvm_hypercall3(KVM_HC_FTRACE, ip, a1, a2);
> +
> +	kfree(buffer);
> +	return ret;
> +}
> +EXPORT_SYMBOL(kvm_ftrace_printk);
> +
>   static void mmu_queue_flush(struct kvm_para_state *state)
>   {

-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm: log directly from the guest to the host kvm buffer
  2011-05-12 15:13 ` Avi Kivity
@ 2011-05-12 15:39   ` Dhaval Giani
  2011-05-12 15:42     ` Avi Kivity
  2011-05-17 15:50     ` Steven Rostedt
  0 siblings, 2 replies; 6+ messages in thread
From: Dhaval Giani @ 2011-05-12 15:39 UTC (permalink / raw)
  To: Avi Kivity
  Cc: kvm, joro, agraf, rostedt, linux-kernel, Fabio Checconi,
	Tommaso Cucinotta

On Thu, May 12, 2011 at 5:13 PM, Avi Kivity <avi@redhat.com> wrote:
> On 05/12/2011 04:36 PM, Dhaval Giani wrote:
>>
>> Hi,
>>
>> As part of some of the work for my project, I have been looking at
>> tracing some of the events in the guest from inside the host. In
>> my usecase, I have been looking to co-relate the time of a network
>> packet arrival with that in the host. ftrace makes such arbitrary
>> use quite simple, so I went ahead an extended this functionality
>> in terms of a hypercall. There are still a few issues with this patch.
>>
>> 1. For some reason, the first time the hypercall is called, it works
>> just fine, but the second invocation refuses to happen. I am still
>> clueless about it. (and am looking for hints :-) )
>> 2. I am not very sure if I got the demarcation between the guest and
>> the host code fine or not. Someone more experienced than me should take
>> a look at the code as well :-)
>> 3. This adds a new paravirt call.
>> 4. This has been implemented just for x86 as of now. If there is enough
>> interest, I will look to make it more generic to be used across other
>> architectures. However, it is quite easy to do the same.
>> 5. It still does not have all the fancy ftrace features, but again,
>> depending on the interest, I can add all those in.
>> 6. Create a config option for this feature.
>>
>> I think such a feature is useful for debugging purposes and might make
>> sense to carry upstream.
>
> I guess it could help things like virtio/vhost development and profiling.
>

Exactly what i am using it for.

>
> I think that one hypercall per trace is too expensive.  Tracing is meant to
> be lightweight!  I think the guest can log to a buffer, which is flushed on
> overflow or when a vmexit occurs.  That gives us automatic serialization
> between a vcpu and the cpu it runs on, but not between a vcpu and a
> different host cpu.
>

hmm. So, basically, log all of these events, and then send them to the
host either on an exit, or when your buffer fills up. There is one
problem with approach though. One of the reasons I wanted this
approach was beacuse i wanted to co-relate the guest and the host
times. (which is why I kept is synchronous). I lose out that
information with what you say. However I see your point about the
overhead. I will think about this a bit more.

>>
>> +int kvm_pv_ftrace(struct kvm_vcpu *vcpu, unsigned long ip, gpa_t addr)
>> +{
>> +       int ret;
>> +       char *fmt = (char *) kzalloc(PAGE_SIZE, GFP_KERNEL);
>> +
>> +       ret = kvm_read_guest(vcpu->kvm, addr, fmt, PAGE_SIZE);
>> +
>> +       trace_printk("KVM instance %p: VCPU %d, IP %lu: %s",
>> +                               vcpu->kvm, vcpu->vcpu_id, ip, fmt);
>> +
>> +       kfree(fmt);
>> +
>> +       return 0;
>> +}
>
> A kmalloc and printf seem expensive here.  I'd prefer to log the arguments
> and format descriptor instead.  Similarly the guest should pass unformatted
> parameters.+int kvm_ftrace_printk(unsigned long ip, const char *fmt, ...)
>>

the trace_printk is actually quite cheap (IIRC), but I guess Steve is
the best person to let us know about that. We can avoid the kzalloc
overhead though.

Dhaval

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm: log directly from the guest to the host kvm buffer
  2011-05-12 15:39   ` Dhaval Giani
@ 2011-05-12 15:42     ` Avi Kivity
  2011-05-12 23:59       ` Eric Northup
  2011-05-17 15:50     ` Steven Rostedt
  1 sibling, 1 reply; 6+ messages in thread
From: Avi Kivity @ 2011-05-12 15:42 UTC (permalink / raw)
  To: Dhaval Giani
  Cc: kvm, joro, agraf, rostedt, linux-kernel, Fabio Checconi,
	Tommaso Cucinotta

On 05/12/2011 06:39 PM, Dhaval Giani wrote:
> >
> >  I think that one hypercall per trace is too expensive.  Tracing is meant to
> >  be lightweight!  I think the guest can log to a buffer, which is flushed on
> >  overflow or when a vmexit occurs.  That gives us automatic serialization
> >  between a vcpu and the cpu it runs on, but not between a vcpu and a
> >  different host cpu.
> >
>
> hmm. So, basically, log all of these events, and then send them to the
> host either on an exit, or when your buffer fills up. There is one
> problem with approach though. One of the reasons I wanted this
> approach was beacuse i wanted to co-relate the guest and the host
> times. (which is why I kept is synchronous). I lose out that
> information with what you say. However I see your point about the
> overhead. I will think about this a bit more.

You might use kvmclock to get a zero-exit (but not zero-cost) time which 
can be correlated.

Another option is to use xadd on a shared memory area to have a global 
counter incremented.  However that can be slow on large machines, and is 
hard to do securely with multiple guests.


-- 
error compiling committee.c: too many arguments to function


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm: log directly from the guest to the host kvm buffer
  2011-05-12 15:42     ` Avi Kivity
@ 2011-05-12 23:59       ` Eric Northup
  0 siblings, 0 replies; 6+ messages in thread
From: Eric Northup @ 2011-05-12 23:59 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Dhaval Giani, kvm, joro, agraf, rostedt, linux-kernel,
	Fabio Checconi, Tommaso Cucinotta

On Thu, May 12, 2011 at 8:42 AM, Avi Kivity <avi@redhat.com> wrote:
>
> On 05/12/2011 06:39 PM, Dhaval Giani wrote:
>>
>> >
>> >  I think that one hypercall per trace is too expensive.  Tracing is meant to
>> >  be lightweight!  I think the guest can log to a buffer, which is flushed on
>> >  overflow or when a vmexit occurs.  That gives us automatic serialization
>> >  between a vcpu and the cpu it runs on, but not between a vcpu and a
>> >  different host cpu.
>> >
>>
>> hmm. So, basically, log all of these events, and then send them to the
>> host either on an exit, or when your buffer fills up. There is one
>> problem with approach though. One of the reasons I wanted this
>> approach was beacuse i wanted to co-relate the guest and the host
>> times. (which is why I kept is synchronous). I lose out that
>> information with what you say. However I see your point about the
>> overhead. I will think about this a bit more.
>
> You might use kvmclock to get a zero-exit (but not zero-cost) time which can be correlated.
>
> Another option is to use xadd on a shared memory area to have a global counter incremented.  However that can be slow on large machines, and is hard to do securely with multiple guests.

If the guest puts guest TSC into the buffer with each event, KVM can
convert guest->host time when it drains the buffers on the next
vmexit.  That's enough information to do an offline correlation of
guest and host events.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH] kvm: log directly from the guest to the host kvm buffer
  2011-05-12 15:39   ` Dhaval Giani
  2011-05-12 15:42     ` Avi Kivity
@ 2011-05-17 15:50     ` Steven Rostedt
  1 sibling, 0 replies; 6+ messages in thread
From: Steven Rostedt @ 2011-05-17 15:50 UTC (permalink / raw)
  To: Dhaval Giani
  Cc: Avi Kivity, kvm, joro, agraf, linux-kernel, Fabio Checconi,
	Tommaso Cucinotta

On Thu, 2011-05-12 at 17:39 +0200, Dhaval Giani wrote:

> >>
> >> +int kvm_pv_ftrace(struct kvm_vcpu *vcpu, unsigned long ip, gpa_t addr)
> >> +{
> >> +       int ret;
> >> +       char *fmt = (char *) kzalloc(PAGE_SIZE, GFP_KERNEL);
> >> +
> >> +       ret = kvm_read_guest(vcpu->kvm, addr, fmt, PAGE_SIZE);
> >> +
> >> +       trace_printk("KVM instance %p: VCPU %d, IP %lu: %s",
> >> +                               vcpu->kvm, vcpu->vcpu_id, ip, fmt);
> >> +
> >> +       kfree(fmt);
> >> +
> >> +       return 0;
> >> +}
> >
> > A kmalloc and printf seem expensive here.  I'd prefer to log the arguments
> > and format descriptor instead.  Similarly the guest should pass unformatted
> > parameters.+int kvm_ftrace_printk(unsigned long ip, const char *fmt, ...)
> >>
> 
> the trace_printk is actually quite cheap (IIRC), but I guess Steve is
> the best person to let us know about that. We can avoid the kzalloc
> overhead though.

You could replace the kzalloc with a static page. Perhaps, one page per
CPU. Then you just need to disable interrupts (unless this is never
called by an interrupt) to get the information into the page, without
any allocation.

trace_printk() does basically the same thing, but uses a single page
with locks. I've thought about replacing that with per cpu pages too,
but haven't gotten around to it.

I also agree that you should just write to some buffer and when it
overflows, flush it.

-- Steve



^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2011-05-17 15:50 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2011-05-12 13:36 [PATCH] kvm: log directly from the guest to the host kvm buffer Dhaval Giani
2011-05-12 15:13 ` Avi Kivity
2011-05-12 15:39   ` Dhaval Giani
2011-05-12 15:42     ` Avi Kivity
2011-05-12 23:59       ` Eric Northup
2011-05-17 15:50     ` Steven Rostedt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).