* perf: record segfaults for cycles event when collecting data on a VM
@ 2012-02-08 16:55 David Ahern
2012-02-08 17:44 ` Joerg Roedel
0 siblings, 1 reply; 11+ messages in thread
From: David Ahern @ 2012-02-08 16:55 UTC (permalink / raw)
To: joerg.roedel, Arnaldo Carvalho de Melo, LKML
$ ps -p 21483
PID TTY TIME CMD
21483 pts/3 00:30:20 qemu-kvm
perf record -e cycles -p 21483 -- sleep 1
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.012 MB perf.data (~503 samples) ]
Segmentation fault
git bisect points to:
1aed2671738785e8f5aea663a6fda91aa7ef59b5 is the first bad commit
commit 1aed2671738785e8f5aea663a6fda91aa7ef59b5
Author: Joerg Roedel <joerg.roedel@amd.com>
Date: Wed Jan 4 17:54:20 2012 +0100
perf kvm: Do guest-only counting by default
Make use of exclude_guest and exlude_host in perf-kvm to do only
guest-only counting by default.
Cc: Gleb Natapov <gleb@redhat.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Joerg Roedel <joro@8bytes.org>
Cc: Peter Zijlstra <a.p.zijlstra@chello.nl>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Joerg Roedel <joerg.roedel@amd.com>
[ committer note: Moved perf_{guest,host} & event_attr_init to util.c ]
[ so as not to drag more stuff to the python binding]
Signed-off-by: Arnaldo Carvalho de Melo <acme@redhat.com>
When processing the events at the end of the collection machine comes as
NULL. Top part of the stack trace:
#0 0x00000000004667c2 in machine__findnew_thread (self=0x0, pid=21483)
at util/thread.c:75
p = 0x28
parent = 0x0
th = 0x8e51b0
#1 0x00000000004395bc in build_id__mark_dso_hit (tool=0x7802e0,
event=0x7ffff7ff9df8,
sample=0x7fffffffe030, evsel=0x8e4c80, machine=0x0) at
util/build-id.c:27
al = {thread = 0x7fffffffe030, map = 0x6f68f12cf0d73400, sym =
0x7fffffffdfa0,
addr = 140737488347184, level = -8 '\370', filtered = 157,
cpumode = 255 '\377', cpu = 32767}
cpumode = 4 '\004'
thread = 0x8e50a0
#2 0x0000000000464aed in perf_session_deliver_event (session=0x8e50a0,
event=0x7ffff7ff9df8,
sample=0x7fffffffe030, tool=0x7802e0, file_offset=11768) at
util/session.c:799
evsel = 0x8e4c80
machine = 0x0
#3 0x0000000000464fd4 in perf_session__process_event (session=0x8e50a0,
event=0x7ffff7ff9df8,
tool=0x7802e0, file_offset=11768) at util/session.c:907
sample = {ip = 18446744071579083331, pid = 21483, tid = 21487,
time = 18446744073709551615,
addr = 0, id = 18446744073709551615, stream_id =
18446744073709551615, period = 10325467,
cpu = 4294967295, raw_size = 0, raw_data = 0x0, callchain = 0x0}
ret = 0
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-08 16:55 perf: record segfaults for cycles event when collecting data on a VM David Ahern
@ 2012-02-08 17:44 ` Joerg Roedel
2012-02-08 17:53 ` David Ahern
0 siblings, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2012-02-08 17:44 UTC (permalink / raw)
To: David Ahern; +Cc: Arnaldo Carvalho de Melo, LKML, Jason Wang
On Wed, Feb 08, 2012 at 09:55:35AM -0700, David Ahern wrote:
> perf record -e cycles -p 21483 -- sleep 1
> [ perf record: Woken up 1 times to write data ]
> [ perf record: Captured and wrote 0.012 MB perf.data (~503 samples) ]
> Segmentation fault
The problem is similar to the one Jason is seeing. I am working on a fix
right now.
Bottom line is that the perf-tool may receive samples tagged as
GUEST_KERNEL even when guest-sampling is disabled (probably a
race-condition). The perf-tool can not find a valid machine pointer for
such a sample and passes NULL down to the other functions. And some
functions don't seem to handle this.
David, can you try to change the default for perf_guest back to false
amd re-test? Not with 'sleep 1' probably, on my setup it takes a busy
guest and a few seconds to trigger.
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-08 17:44 ` Joerg Roedel
@ 2012-02-08 17:53 ` David Ahern
2012-02-08 17:57 ` Joerg Roedel
0 siblings, 1 reply; 11+ messages in thread
From: David Ahern @ 2012-02-08 17:53 UTC (permalink / raw)
To: Joerg Roedel; +Cc: Arnaldo Carvalho de Melo, LKML, Jason Wang
On 02/08/2012 10:44 AM, Joerg Roedel wrote:
> On Wed, Feb 08, 2012 at 09:55:35AM -0700, David Ahern wrote:
>> perf record -e cycles -p 21483 -- sleep 1
>> [ perf record: Woken up 1 times to write data ]
>> [ perf record: Captured and wrote 0.012 MB perf.data (~503 samples) ]
>> Segmentation fault
>
> The problem is similar to the one Jason is seeing. I am working on a fix
> right now.
> Bottom line is that the perf-tool may receive samples tagged as
> GUEST_KERNEL even when guest-sampling is disabled (probably a
> race-condition). The perf-tool can not find a valid machine pointer for
> such a sample and passes NULL down to the other functions. And some
> functions don't seem to handle this.
>
> David, can you try to change the default for perf_guest back to false
> amd re-test? Not with 'sleep 1' probably, on my setup it takes a busy
> guest and a few seconds to trigger.
The segfault is in the event processing once the collection is done
(perf-record), so the time length should not matter.
Either way, this fixes the segafult:
diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
index 8131410..fb25d13 100644
--- a/tools/perf/util/util.c
+++ b/tools/perf/util/util.c
@@ -6,7 +6,7 @@
* XXX We need to find a better place for these things...
*/
bool perf_host = true;
-bool perf_guest = true;
+bool perf_guest = false;
void event_attr_init(struct perf_event_attr *attr)
{
which makes sense. It forces perf_session__find_machine_for_cpumode() to
return the host machine always.
David
>
>
> Joerg
>
^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-08 17:53 ` David Ahern
@ 2012-02-08 17:57 ` Joerg Roedel
2012-02-09 7:30 ` Ingo Molnar
0 siblings, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2012-02-08 17:57 UTC (permalink / raw)
To: David Ahern; +Cc: Arnaldo Carvalho de Melo, LKML, Jason Wang
On Wed, Feb 08, 2012 at 10:53:04AM -0700, David Ahern wrote:
> The segfault is in the event processing once the collection is done
> (perf-record), so the time length should not matter.
Yes, but I wanted to make sure there is a GUEST_KERNEL sample in there
to process. It usually takes some time (at least in my tests) to get a
spurious one.
> Either way, this fixes the segafult:
>
> diff --git a/tools/perf/util/util.c b/tools/perf/util/util.c
> index 8131410..fb25d13 100644
> --- a/tools/perf/util/util.c
> +++ b/tools/perf/util/util.c
> @@ -6,7 +6,7 @@
> * XXX We need to find a better place for these things...
> */
> bool perf_host = true;
> -bool perf_guest = true;
> +bool perf_guest = false;
>
> void event_attr_init(struct perf_event_attr *attr)
> {
>
> which makes sense. It forces perf_session__find_machine_for_cpumode() to
> return the host machine always.
Great, thanks. I will send two patches tomorrow to fix Jason's problem
and change the default for perf_guest.
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-08 17:57 ` Joerg Roedel
@ 2012-02-09 7:30 ` Ingo Molnar
2012-02-09 11:14 ` Joerg Roedel
0 siblings, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2012-02-09 7:30 UTC (permalink / raw)
To: Joerg Roedel; +Cc: David Ahern, Arnaldo Carvalho de Melo, LKML, Jason Wang
* Joerg Roedel <joerg.roedel@amd.com> wrote:
> > which makes sense. It forces
> > perf_session__find_machine_for_cpumode() to return the host
> > machine always.
>
> Great, thanks. I will send two patches tomorrow to fix Jason's
> problem and change the default for perf_guest.
Well, if the crash is fixed then the the default can stay,
right?
Generally we should treat all input data in a perf.data or even
the bits we get in the ring-buffer as external data that has to
be checked carefully, with no assumptions made about data.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-09 7:30 ` Ingo Molnar
@ 2012-02-09 11:14 ` Joerg Roedel
2012-02-09 13:34 ` Ingo Molnar
0 siblings, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2012-02-09 11:14 UTC (permalink / raw)
To: Ingo Molnar; +Cc: David Ahern, Arnaldo Carvalho de Melo, LKML, Jason Wang
On Thu, Feb 09, 2012 at 08:30:24AM +0100, Ingo Molnar wrote:
>
> * Joerg Roedel <joerg.roedel@amd.com> wrote:
>
> > > which makes sense. It forces
> > > perf_session__find_machine_for_cpumode() to return the host
> > > machine always.
> >
> > Great, thanks. I will send two patches tomorrow to fix Jason's
> > problem and change the default for perf_guest.
>
> Well, if the crash is fixed then the the default can stay,
> right?
David's crash is fixed by changing the default back to its original
value :)
> Generally we should treat all input data in a perf.data or even
> the bits we get in the ring-buffer as external data that has to
> be checked carefully, with no assumptions made about data.
Well, there are two options:
1) Make sure machine == NULL does not happen. Changing the
default of perf_guest back to false does exactly this for
David's problem.
2) Make sure that a machine == NULL pointer is never
dereferenced
I was going to fix it with option 1. Do you suggest option 2 is better?
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-09 11:14 ` Joerg Roedel
@ 2012-02-09 13:34 ` Ingo Molnar
2012-02-09 14:32 ` Joerg Roedel
2012-02-09 14:43 ` David Ahern
0 siblings, 2 replies; 11+ messages in thread
From: Ingo Molnar @ 2012-02-09 13:34 UTC (permalink / raw)
To: Joerg Roedel; +Cc: David Ahern, Arnaldo Carvalho de Melo, LKML, Jason Wang
* Joerg Roedel <joerg.roedel@amd.com> wrote:
> On Thu, Feb 09, 2012 at 08:30:24AM +0100, Ingo Molnar wrote:
> >
> > * Joerg Roedel <joerg.roedel@amd.com> wrote:
> >
> > > > which makes sense. It forces
> > > > perf_session__find_machine_for_cpumode() to return the host
> > > > machine always.
> > >
> > > Great, thanks. I will send two patches tomorrow to fix Jason's
> > > problem and change the default for perf_guest.
> >
> > Well, if the crash is fixed then the the default can stay,
> > right?
>
> David's crash is fixed by changing the default back to its
> original value :)
Then that's the wrong fix really.
> > Generally we should treat all input data in a perf.data or even
> > the bits we get in the ring-buffer as external data that has to
> > be checked carefully, with no assumptions made about data.
>
> Well, there are two options:
>
> 1) Make sure machine == NULL does not happen. Changing the
> default of perf_guest back to false does exactly this for
> David's problem.
So what if it's turned on by the user? Do we still crash
occasionally?
> 2) Make sure that a machine == NULL pointer is never
> dereferenced
>
> I was going to fix it with option 1. Do you suggest option 2 is better?
Looks like the better fix. You said:
> Bottom line is that the perf-tool may receive samples tagged
> as GUEST_KERNEL even when guest-sampling is disabled (probably
> a race-condition). The perf-tool can not find a valid machine
> pointer for such a sample and passes NULL down to the other
> functions. And some functions don't seem to handle this.
tooling should never be surprised by getting some unexpected
sample via the perf.data or the ring-buffer - regardless of
whether that functionality is default enabled or manually
enabled.
Thanks,
Ingo
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-09 13:34 ` Ingo Molnar
@ 2012-02-09 14:32 ` Joerg Roedel
2012-02-09 15:28 ` Joerg Roedel
2012-02-09 14:43 ` David Ahern
1 sibling, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2012-02-09 14:32 UTC (permalink / raw)
To: Ingo Molnar; +Cc: David Ahern, Arnaldo Carvalho de Melo, LKML, Jason Wang
On Thu, Feb 09, 2012 at 02:34:47PM +0100, Ingo Molnar wrote:
>
> * Joerg Roedel <joerg.roedel@amd.com> wrote:
> > Well, there are two options:
> >
> > 1) Make sure machine == NULL does not happen. Changing the
> > default of perf_guest back to false does exactly this for
> > David's problem.
>
> So what if it's turned on by the user? Do we still crash
> occasionally?
It is only turned on by perf-kvm, and this path should setup a machine
object for guest samples.
>
> > 2) Make sure that a machine == NULL pointer is never
> > dereferenced
> >
> > I was going to fix it with option 1. Do you suggest option 2 is better?
>
> Looks like the better fix. You said:
>
> > Bottom line is that the perf-tool may receive samples tagged
> > as GUEST_KERNEL even when guest-sampling is disabled (probably
> > a race-condition). The perf-tool can not find a valid machine
> > pointer for such a sample and passes NULL down to the other
> > functions. And some functions don't seem to handle this.
>
> tooling should never be surprised by getting some unexpected
> sample via the perf.data or the ring-buffer - regardless of
> whether that functionality is default enabled or manually
> enabled.
Yeah, right. Guest samples may also show up intentionally when the event
modifiers are used. So crashing on machine==NULL needs to be fixed.
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-09 13:34 ` Ingo Molnar
2012-02-09 14:32 ` Joerg Roedel
@ 2012-02-09 14:43 ` David Ahern
1 sibling, 0 replies; 11+ messages in thread
From: David Ahern @ 2012-02-09 14:43 UTC (permalink / raw)
To: Ingo Molnar, Joerg Roedel; +Cc: Arnaldo Carvalho de Melo, LKML, Jason Wang
On 02/09/2012 06:34 AM, Ingo Molnar wrote:
>
> * Joerg Roedel <joerg.roedel@amd.com> wrote:
>
>> On Thu, Feb 09, 2012 at 08:30:24AM +0100, Ingo Molnar wrote:
>>>
>>> * Joerg Roedel <joerg.roedel@amd.com> wrote:
>>>
>>>>> which makes sense. It forces
>>>>> perf_session__find_machine_for_cpumode() to return the host
>>>>> machine always.
>>>>
>>>> Great, thanks. I will send two patches tomorrow to fix Jason's
>>>> problem and change the default for perf_guest.
>>>
>>> Well, if the crash is fixed then the the default can stay,
>>> right?
>>
>> David's crash is fixed by changing the default back to its
>> original value :)
>
> Then that's the wrong fix really.
You can't enable perf_guest unless guest info has been configured as
done with perf-kvm.
The problem is triggered in perf_session__find_machine_for_cpumode(). If
perf_session__find_machine() is changed to
perf_session__findnew_machine() you get into machines__findnew() which
shows the root cause: the event is for a VM but symbol_conf.guestmount
is not set.
So, really perf_guest cannot be enabled unless guest info has been given.
David
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-09 14:32 ` Joerg Roedel
@ 2012-02-09 15:28 ` Joerg Roedel
2012-02-09 15:47 ` David Ahern
0 siblings, 1 reply; 11+ messages in thread
From: Joerg Roedel @ 2012-02-09 15:28 UTC (permalink / raw)
To: Ingo Molnar; +Cc: David Ahern, Arnaldo Carvalho de Melo, LKML, Jason Wang
On Thu, Feb 09, 2012 at 03:32:53PM +0100, Joerg Roedel wrote:
> Yeah, right. Guest samples may also show up intentionally when the event
> modifiers are used. So crashing on machine==NULL needs to be fixed.
Hmm, on the other hand, there is no way to specify guest symbol
information to the regular perf subcommands except for perf-kvm. Without
that information perf can not resolve any GUEST_KERNEL samples, so in
the end there is no point in setting perf_guest to true except for
perf-kvm, no?
Joerg
--
AMD Operating System Research Center
Advanced Micro Devices GmbH Einsteinring 24 85609 Dornach
General Managers: Alberto Bozzo
Registration: Dornach, Landkr. Muenchen; Registerger. Muenchen, HRB Nr. 43632
^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: perf: record segfaults for cycles event when collecting data on a VM
2012-02-09 15:28 ` Joerg Roedel
@ 2012-02-09 15:47 ` David Ahern
0 siblings, 0 replies; 11+ messages in thread
From: David Ahern @ 2012-02-09 15:47 UTC (permalink / raw)
To: Joerg Roedel; +Cc: Ingo Molnar, Arnaldo Carvalho de Melo, LKML, Jason Wang
On 02/09/2012 08:28 AM, Joerg Roedel wrote:
> On Thu, Feb 09, 2012 at 03:32:53PM +0100, Joerg Roedel wrote:
>> Yeah, right. Guest samples may also show up intentionally when the event
>> modifiers are used. So crashing on machine==NULL needs to be fixed.
>
> Hmm, on the other hand, there is no way to specify guest symbol
> information to the regular perf subcommands except for perf-kvm. Without
> that information perf can not resolve any GUEST_KERNEL samples, so in
> the end there is no point in setting perf_guest to true except for
> perf-kvm, no?
Exactly. That was my point in my last reply.
^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2012-02-09 15:47 UTC | newest]
Thread overview: 11+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2012-02-08 16:55 perf: record segfaults for cycles event when collecting data on a VM David Ahern
2012-02-08 17:44 ` Joerg Roedel
2012-02-08 17:53 ` David Ahern
2012-02-08 17:57 ` Joerg Roedel
2012-02-09 7:30 ` Ingo Molnar
2012-02-09 11:14 ` Joerg Roedel
2012-02-09 13:34 ` Ingo Molnar
2012-02-09 14:32 ` Joerg Roedel
2012-02-09 15:28 ` Joerg Roedel
2012-02-09 15:47 ` David Ahern
2012-02-09 14:43 ` David Ahern
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).