kvm.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* ftrace/perf_event leak
@ 2010-09-01  8:42 Avi Kivity
  2010-09-01  9:04 ` Peter Zijlstra
  0 siblings, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2010-09-01  8:42 UTC (permalink / raw)
  To: Peter Zijlstra, Ingo Molnar, Frederic Weisbecker, Steven Rostedt
  Cc: kvm-devel, Linux Kernel Mailing List

  I recently added perf_event support to kvm_stat, to display kvm 
tracepoints as statistics (I'd like to fold this to tools/perf 
eventually, but that's another story).  However I'm seeing a resource 
leak - after I quit the tool, there are quite a few references into the 
kvm module:

   kvm_intel              43655  0
   kvm                   272984  269 kvm_intel

The tool is just a python script that reads 
/sys/kernel/debug/tracing/events/kvm to find out which events are 
available, uses perf_event_open() to create one group per cpu to which a 
lot of events are attached.  The only special thing I can think of is 
that we use an ioctl to attach a filter to many perf_event descriptors.

You can find the source at 
http://git.kernel.org/?p=virt/kvm/qemu-kvm.git;a=blob_plain;f=kvm/kvm_stat;hb=5bd5f131b50cb373ff4e2a3632c6dad00a1f0b55.  
All it needs are the kvm modules loaded; no need to actually run a 
guest.  Run as root.

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01  8:42 ftrace/perf_event leak Avi Kivity
@ 2010-09-01  9:04 ` Peter Zijlstra
  2010-09-01  9:26   ` Avi Kivity
  0 siblings, 1 reply; 13+ messages in thread
From: Peter Zijlstra @ 2010-09-01  9:04 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Ingo Molnar, Frederic Weisbecker, Steven Rostedt, kvm-devel,
	Linux Kernel Mailing List

On Wed, 2010-09-01 at 11:42 +0300, Avi Kivity wrote:
> I recently added perf_event support to kvm_stat, to display kvm 
> tracepoints as statistics (I'd like to fold this to tools/perf 
> eventually, but that's another story).  However I'm seeing a resource 
> leak - after I quit the tool, there are quite a few references into the 
> kvm module:
> 
>    kvm_intel              43655  0
>    kvm                   272984  269 kvm_intel
> 
> The tool is just a python script that reads 
> /sys/kernel/debug/tracing/events/kvm to find out which events are 
> available, uses perf_event_open() to create one group per cpu to which a 
> lot of events are attached.  The only special thing I can think of is 
> that we use an ioctl to attach a filter to many perf_event descriptors.
> 
> You can find the source at 
> http://git.kernel.org/?p=virt/kvm/qemu-kvm.git;a=blob_plain;f=kvm/kvm_stat;hb=5bd5f131b50cb373ff4e2a3632c6dad00a1f0b55.  
> All it needs are the kvm modules loaded; no need to actually run a 
> guest.  Run as root.
> 

Does something like the below cure that?

I seem to remember C doesn't make any promises about the order of logic
statements, hence we need to explicitly pull out that try_module_get()
so that it evaluates after the rest of the conditions.

---
 kernel/trace/trace_event_perf.c |   10 ++++++----
 1 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 000e6e8..35051f2 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -88,10 +88,11 @@ int perf_trace_init(struct perf_event *p_event)
 	mutex_lock(&event_mutex);
 	list_for_each_entry(tp_event, &ftrace_events, list) {
 		if (tp_event->event.type == event_id &&
-		    tp_event->class && tp_event->class->reg &&
-		    try_module_get(tp_event->mod)) {
-			ret = perf_trace_event_init(tp_event, p_event);
-			break;
+		    tp_event->class && tp_event->class->reg) {
+		    	if (try_module_get(tp_event->mod)) {
+				ret = perf_trace_event_init(tp_event, p_event);
+				break;
+			}
 		}
 	}
 	mutex_unlock(&event_mutex);
@@ -138,6 +139,7 @@ void perf_trace_destroy(struct perf_event *p_event)
 
 	free_percpu(tp_event->perf_events);
 	tp_event->perf_events = NULL;
+	module_put(tp_event->mod);
 
 	if (!--total_ref_count) {
 		for (i = 0; i < 4; i++) {

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01  9:04 ` Peter Zijlstra
@ 2010-09-01  9:26   ` Avi Kivity
  2010-09-01  9:35     ` Peter Zijlstra
  2010-09-01  9:38     ` Li Zefan
  0 siblings, 2 replies; 13+ messages in thread
From: Avi Kivity @ 2010-09-01  9:26 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Ingo Molnar, Frederic Weisbecker, Steven Rostedt, kvm-devel,
	Linux Kernel Mailing List

  On 09/01/2010 12:04 PM, Peter Zijlstra wrote:
>
> Does something like the below cure that?
>

Unfortunately not.

> I seem to remember C doesn't make any promises about the order of logic
> statements, hence we need to explicitly pull out that try_module_get()
> so that it evaluates after the rest of the conditions.

&& and || are executed in order (so you can write

    if (p && p->x)

).

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01  9:26   ` Avi Kivity
@ 2010-09-01  9:35     ` Peter Zijlstra
  2010-09-01  9:38     ` Li Zefan
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2010-09-01  9:35 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Ingo Molnar, Frederic Weisbecker, Steven Rostedt, kvm-devel,
	Linux Kernel Mailing List

On Wed, 2010-09-01 at 12:26 +0300, Avi Kivity wrote:
> On 09/01/2010 12:04 PM, Peter Zijlstra wrote:
> >
> > Does something like the below cure that?
> >
> 
> Unfortunately not.

Hrmm,.. bugger, don't use modules is my motto.. still I'll have another
look.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01  9:26   ` Avi Kivity
  2010-09-01  9:35     ` Peter Zijlstra
@ 2010-09-01  9:38     ` Li Zefan
  2010-09-01  9:41       ` Peter Zijlstra
  2010-09-01 10:38       ` Avi Kivity
  1 sibling, 2 replies; 13+ messages in thread
From: Li Zefan @ 2010-09-01  9:38 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Peter Zijlstra, Ingo Molnar, Frederic Weisbecker, Steven Rostedt,
	kvm-devel, Linux Kernel Mailing List

Avi Kivity wrote:
>  On 09/01/2010 12:04 PM, Peter Zijlstra wrote:
>>
>> Does something like the below cure that?
>>
> 
> Unfortunately not.
> 

Then try this:

The bug should be caused by commit 1c024eca51fdc965290acf342ae16a476c2189d0.

---
diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 8a2b73f..9d1d1f2 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -161,6 +161,7 @@ void perf_trace_destroy(struct perf_event *p_event)
 		}
 	}
 out:
+	module_put(tp_event->mod);
 	mutex_unlock(&event_mutex);
 }
 

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01  9:38     ` Li Zefan
@ 2010-09-01  9:41       ` Peter Zijlstra
  2010-09-01 10:38       ` Avi Kivity
  1 sibling, 0 replies; 13+ messages in thread
From: Peter Zijlstra @ 2010-09-01  9:41 UTC (permalink / raw)
  To: Li Zefan
  Cc: Avi Kivity, Ingo Molnar, Frederic Weisbecker, Steven Rostedt,
	kvm-devel, Linux Kernel Mailing List

On Wed, 2010-09-01 at 17:38 +0800, Li Zefan wrote:

> ---
> diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
> index 8a2b73f..9d1d1f2 100644
> --- a/kernel/trace/trace_event_perf.c
> +++ b/kernel/trace/trace_event_perf.c
> @@ -161,6 +161,7 @@ void perf_trace_destroy(struct perf_event *p_event)
>  		}
>  	}
>  out:
> +	module_put(tp_event->mod);
>  	mutex_unlock(&event_mutex);
>  }

Ah, indeed, we so that try_module_get() for each reference to the
tracepoint, I guess we also should do the below, in case the
registration fails.

diff --git a/kernel/trace/trace_event_perf.c b/kernel/trace/trace_event_perf.c
index 000e6e8..31cc4cb 100644
--- a/kernel/trace/trace_event_perf.c
+++ b/kernel/trace/trace_event_perf.c
@@ -91,6 +91,8 @@ int perf_trace_init(struct perf_event *p_event)
 		    tp_event->class && tp_event->class->reg &&
 		    try_module_get(tp_event->mod)) {
 			ret = perf_trace_event_init(tp_event, p_event);
+			if (ret)
+				module_put(tp_event->mod);
 			break;
 		}
 	}

^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01  9:38     ` Li Zefan
  2010-09-01  9:41       ` Peter Zijlstra
@ 2010-09-01 10:38       ` Avi Kivity
  2010-09-01 11:02         ` Peter Zijlstra
  1 sibling, 1 reply; 13+ messages in thread
From: Avi Kivity @ 2010-09-01 10:38 UTC (permalink / raw)
  To: Li Zefan
  Cc: Peter Zijlstra, Ingo Molnar, Frederic Weisbecker, Steven Rostedt,
	kvm-devel, Linux Kernel Mailing List

  On 09/01/2010 12:38 PM, Li Zefan wrote:
>
> Then try this:

Tested-by: Avi Kivity <avi@redhat.com>

-- 
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01 10:38       ` Avi Kivity
@ 2010-09-01 11:02         ` Peter Zijlstra
  2010-09-01 11:07           ` Ingo Molnar
  2010-09-01 12:15           ` Frederic Weisbecker
  0 siblings, 2 replies; 13+ messages in thread
From: Peter Zijlstra @ 2010-09-01 11:02 UTC (permalink / raw)
  To: Avi Kivity
  Cc: Li Zefan, Ingo Molnar, Frederic Weisbecker, Steven Rostedt,
	kvm-devel, Linux Kernel Mailing List

On Wed, 2010-09-01 at 13:38 +0300, Avi Kivity wrote:
> On 09/01/2010 12:38 PM, Li Zefan wrote:
> >
> > Then try this:
> 
> Tested-by: Avi Kivity <avi@redhat.com>
> 

Thanks, queued as:

---
Subject: perf, trace: Fix module leak
From: Li Zefan <lizf@cn.fujitsu.com>
Date: Wed Sep 01 12:58:43 CEST 2010

Commit 1c024eca (perf, trace: Optimize tracepoints by using
per-tracepoint-per-cpu hlist to track events) caused a module refcount
leak.

Tested-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
LKML-Reference: <4C7E1F12.8030304@cn.fujitsu.com>
---
 kernel/trace/trace_event_perf.c |    3 +++
 1 file changed, 3 insertions(+)

Index: linux-2.6/kernel/trace/trace_event_perf.c
===================================================================
--- linux-2.6.orig/kernel/trace/trace_event_perf.c
+++ linux-2.6/kernel/trace/trace_event_perf.c
@@ -91,6 +91,8 @@ int perf_trace_init(struct perf_event *p
 		    tp_event->class && tp_event->class->reg &&
 		    try_module_get(tp_event->mod)) {
 			ret = perf_trace_event_init(tp_event, p_event);
+			if (ret)
+				module_put(tp_event->mod);
 			break;
 		}
 	}
@@ -147,6 +149,7 @@ void perf_trace_destroy(struct perf_even
 		}
 	}
 out:
+	module_put(tp_event->mod);
 	mutex_unlock(&event_mutex);
 }
 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01 11:02         ` Peter Zijlstra
@ 2010-09-01 11:07           ` Ingo Molnar
  2010-09-01 12:15           ` Frederic Weisbecker
  1 sibling, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2010-09-01 11:07 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Avi Kivity, Li Zefan, Frederic Weisbecker, Steven Rostedt,
	kvm-devel, Linux Kernel Mailing List


* Peter Zijlstra <peterz@infradead.org> wrote:

> On Wed, 2010-09-01 at 13:38 +0300, Avi Kivity wrote:
> > On 09/01/2010 12:38 PM, Li Zefan wrote:
> > >
> > > Then try this:
> > 
> > Tested-by: Avi Kivity <avi@redhat.com>
> > 
> 
> Thanks, queued as:
> 
> ---
> Subject: perf, trace: Fix module leak
> From: Li Zefan <lizf@cn.fujitsu.com>
> Date: Wed Sep 01 12:58:43 CEST 2010
> 
> Commit 1c024eca (perf, trace: Optimize tracepoints by using
> per-tracepoint-per-cpu hlist to track events) caused a module refcount
> leak.
> 
> Tested-by: Avi Kivity <avi@redhat.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> LKML-Reference: <4C7E1F12.8030304@cn.fujitsu.com>

Do we need this for -stable too?

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01 11:02         ` Peter Zijlstra
  2010-09-01 11:07           ` Ingo Molnar
@ 2010-09-01 12:15           ` Frederic Weisbecker
  2010-09-01 13:59             ` Steven Rostedt
  2010-09-02  1:20             ` Li Zefan
  1 sibling, 2 replies; 13+ messages in thread
From: Frederic Weisbecker @ 2010-09-01 12:15 UTC (permalink / raw)
  To: Peter Zijlstra
  Cc: Avi Kivity, Li Zefan, Ingo Molnar, Steven Rostedt, kvm-devel,
	Linux Kernel Mailing List

On Wed, Sep 01, 2010 at 01:02:57PM +0200, Peter Zijlstra wrote:
> On Wed, 2010-09-01 at 13:38 +0300, Avi Kivity wrote:
> > On 09/01/2010 12:38 PM, Li Zefan wrote:
> > >
> > > Then try this:
> > 
> > Tested-by: Avi Kivity <avi@redhat.com>
> > 
> 
> Thanks, queued as:
> 
> ---
> Subject: perf, trace: Fix module leak
> From: Li Zefan <lizf@cn.fujitsu.com>
> Date: Wed Sep 01 12:58:43 CEST 2010
> 
> Commit 1c024eca (perf, trace: Optimize tracepoints by using
> per-tracepoint-per-cpu hlist to track events) caused a module refcount
> leak.
> 
> Tested-by: Avi Kivity <avi@redhat.com>
> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
> LKML-Reference: <4C7E1F12.8030304@cn.fujitsu.com>
> ---
>  kernel/trace/trace_event_perf.c |    3 +++
>  1 file changed, 3 insertions(+)
> 
> Index: linux-2.6/kernel/trace/trace_event_perf.c
> ===================================================================
> --- linux-2.6.orig/kernel/trace/trace_event_perf.c
> +++ linux-2.6/kernel/trace/trace_event_perf.c
> @@ -91,6 +91,8 @@ int perf_trace_init(struct perf_event *p
>  		    tp_event->class && tp_event->class->reg &&
>  		    try_module_get(tp_event->mod)) {
>  			ret = perf_trace_event_init(tp_event, p_event);
> +			if (ret)
> +				module_put(tp_event->mod);
>  			break;
>  		}
>  	}
> @@ -147,6 +149,7 @@ void perf_trace_destroy(struct perf_even
>  		}
>  	}
>  out:
> +	module_put(tp_event->mod);
>  	mutex_unlock(&event_mutex);
>  }
>  
> 



Thanks for fixing this.

However, can we split this in two patches to ease the backport?

The lack of a module_put() after perf_trace_init() failure is there for a while
(the backport needs to start in 2.6.32).

But the lack of a module_put in the destroy path needs a .35 backport only.


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01 12:15           ` Frederic Weisbecker
@ 2010-09-01 13:59             ` Steven Rostedt
  2010-09-01 17:32               ` Ingo Molnar
  2010-09-02  1:20             ` Li Zefan
  1 sibling, 1 reply; 13+ messages in thread
From: Steven Rostedt @ 2010-09-01 13:59 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Avi Kivity, Li Zefan, Ingo Molnar, kvm-devel,
	Linux Kernel Mailing List

On Wed, 2010-09-01 at 14:15 +0200, Frederic Weisbecker wrote:

> Thanks for fixing this.
> 
> However, can we split this in two patches to ease the backport?
> 
> The lack of a module_put() after perf_trace_init() failure is there for a while
> (the backport needs to start in 2.6.32).
> 
> But the lack of a module_put in the destroy path needs a .35 backport only.

I don't think it really needs two patches. Just notify stable (and 
Greg KH in particular) about the backport requirements. Greg can handle
it ;)

-- Steve



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01 13:59             ` Steven Rostedt
@ 2010-09-01 17:32               ` Ingo Molnar
  0 siblings, 0 replies; 13+ messages in thread
From: Ingo Molnar @ 2010-09-01 17:32 UTC (permalink / raw)
  To: Steven Rostedt
  Cc: Frederic Weisbecker, Peter Zijlstra, Avi Kivity, Li Zefan,
	kvm-devel, Linux Kernel Mailing List


* Steven Rostedt <rostedt@goodmis.org> wrote:

> On Wed, 2010-09-01 at 14:15 +0200, Frederic Weisbecker wrote:
> 
> > Thanks for fixing this.
> > 
> > However, can we split this in two patches to ease the backport?
> > 
> > The lack of a module_put() after perf_trace_init() failure is there for a while
> > (the backport needs to start in 2.6.32).
> > 
> > But the lack of a module_put in the destroy path needs a .35 backport only.
> 
> I don't think it really needs two patches. Just notify stable (and 
> Greg KH in particular) about the backport requirements. Greg can 
> handle it ;)

Well, Greg certainly has more than enough to handle, so if there's 
different chunks with different -stable vectors then it would be most 
helpful to him to split things up!

Manually trying to split up patches is both error-prone and 
stress-inducing.

Thanks,

	Ingo

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: ftrace/perf_event leak
  2010-09-01 12:15           ` Frederic Weisbecker
  2010-09-01 13:59             ` Steven Rostedt
@ 2010-09-02  1:20             ` Li Zefan
  1 sibling, 0 replies; 13+ messages in thread
From: Li Zefan @ 2010-09-02  1:20 UTC (permalink / raw)
  To: Frederic Weisbecker
  Cc: Peter Zijlstra, Avi Kivity, Ingo Molnar, Steven Rostedt,
	kvm-devel, Linux Kernel Mailing List

>> Subject: perf, trace: Fix module leak
>> From: Li Zefan <lizf@cn.fujitsu.com>
>> Date: Wed Sep 01 12:58:43 CEST 2010
>>
>> Commit 1c024eca (perf, trace: Optimize tracepoints by using
>> per-tracepoint-per-cpu hlist to track events) caused a module refcount
>> leak.
>>
>> Tested-by: Avi Kivity <avi@redhat.com>
>> Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
>> LKML-Reference: <4C7E1F12.8030304@cn.fujitsu.com>
>> ---
>>  kernel/trace/trace_event_perf.c |    3 +++
>>  1 file changed, 3 insertions(+)
>>
>> Index: linux-2.6/kernel/trace/trace_event_perf.c
>> ===================================================================
>> --- linux-2.6.orig/kernel/trace/trace_event_perf.c
>> +++ linux-2.6/kernel/trace/trace_event_perf.c
>> @@ -91,6 +91,8 @@ int perf_trace_init(struct perf_event *p
>>  		    tp_event->class && tp_event->class->reg &&
>>  		    try_module_get(tp_event->mod)) {
>>  			ret = perf_trace_event_init(tp_event, p_event);
>> +			if (ret)
>> +				module_put(tp_event->mod);
>>  			break;
>>  		}
>>  	}
>> @@ -147,6 +149,7 @@ void perf_trace_destroy(struct perf_even
>>  		}
>>  	}
>>  out:
>> +	module_put(tp_event->mod);
>>  	mutex_unlock(&event_mutex);
>>  }
>>  
>>
> 
> Thanks for fixing this.
> 
> However, can we split this in two patches to ease the backport?
> 
> The lack of a module_put() after perf_trace_init() failure is there for a while
> (the backport needs to start in 2.6.32).

The failure should be a rare case, I don't think this has to be backported?

> 
> But the lack of a module_put in the destroy path needs a .35 backport only.
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2010-09-02  1:15 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2010-09-01  8:42 ftrace/perf_event leak Avi Kivity
2010-09-01  9:04 ` Peter Zijlstra
2010-09-01  9:26   ` Avi Kivity
2010-09-01  9:35     ` Peter Zijlstra
2010-09-01  9:38     ` Li Zefan
2010-09-01  9:41       ` Peter Zijlstra
2010-09-01 10:38       ` Avi Kivity
2010-09-01 11:02         ` Peter Zijlstra
2010-09-01 11:07           ` Ingo Molnar
2010-09-01 12:15           ` Frederic Weisbecker
2010-09-01 13:59             ` Steven Rostedt
2010-09-01 17:32               ` Ingo Molnar
2010-09-02  1:20             ` Li Zefan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).