From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1762894AbcALRPY (ORCPT ); Tue, 12 Jan 2016 12:15:24 -0500 Received: from mail.kernel.org ([198.145.29.136]:39041 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753101AbcALRPX (ORCPT ); Tue, 12 Jan 2016 12:15:23 -0500 Date: Tue, 12 Jan 2016 14:15:17 -0300 From: Arnaldo Carvalho de Melo To: Peter Zijlstra Cc: Ingo Molnar , Stephane Eranian , Namhyung Kim , LKML , Jiri Olsa , Namhyung Kim , Adrian Hunter , "ak@linux.intel.com" Subject: Re: [RFC] perf record: missing buildid for callstack modules Message-ID: <20160112171517.GI18367@kernel.org> References: <20160111173036.GA6344@twins.programming.kicks-ass.net> <20160112103943.GA6310@gmail.com> <20160112113521.GC6357@twins.programming.kicks-ass.net> <20160112121805.GA32199@gmail.com> <20160112134012.GF6357@twins.programming.kicks-ass.net> <20160112143805.GX18367@kernel.org> <20160112153440.GJ6357@twins.programming.kicks-ass.net> <20160112154812.GH18367@kernel.org> <20160112161027.GN6357@twins.programming.kicks-ass.net> <20160112162719.GX6373@twins.programming.kicks-ass.net> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160112162719.GX6373@twins.programming.kicks-ass.net> X-Url: http://acmel.wordpress.com User-Agent: Mutt/1.5.23 (2014-03-12) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Em Tue, Jan 12, 2016 at 05:27:19PM +0100, Peter Zijlstra escreveu: > On Tue, Jan 12, 2016 at 05:10:27PM +0100, Peter Zijlstra wrote: > > > How will we collect the mtime for the DSOs in PERF_RECORD_MMAP records > > > if we don't look at those records? > > Kernel side, the vma has a vm_file member. So Gotcha, via PERF_RECORD_MMAP3, using something we have (mtime) that while not good as a content-based cookie (that we don't easily have at PERF_RECORD_MMAP3 generation time) will allow us to do minimal detection of mismatched DSOs at analysis time. > > vma->vm_file->f_inode->i_mtime will get us that for all file based > > mmap()s. > > (or maybe ctime, not sure how popular it is these days to switch off > > mtime accounting). > And this would (obviously) be a good time to see what else we'd like to > stuff in there. Perhaps a version number? Else we'll be using those reserved bits again when we decide to introduce MMAP4 ;-\ But I still think we should have a content-based cookie to be delivered via that record, so at a minimum leave a cookie there starting with a size, that, for now, would be zero. If present, that would get into the current build-id infrastructure existing in the tooling side, something like: + /* + * The MMAP3 records are an augmented version of MMAP2, they add + * mtime and an optional content-based cookie, if available. + * struct perf_event_header header; + * + * u32 pid, tid; + * u64 addr; + * u64 len; + * u64 pgoff; + * u32 maj; + * u32 min; + * u64 ino; + * u64 ino_generation; + * u32 prot, flags; + * u64 mtime; + * u32 cookie_len + * char cookie[2+]; + * char filename[]; + * struct sample_id sample_id; Then someone (I want to do this, after processing more stuff from a neverending backlog) should try to experiment with the varios ELF loaders, etc to, in the process of loading, set that content-based cookie somehow (ioctl? prctl? whatever). - Arnaldo > --- > include/uapi/linux/perf_event.h | 26 +++++++++++++++++++++++++- > kernel/events/core.c | 19 +++++++++++++++---- > 2 files changed, 40 insertions(+), 5 deletions(-) > > diff --git a/include/uapi/linux/perf_event.h b/include/uapi/linux/perf_event.h > index 1afe9623c1a7..a0f43140a0e2 100644 > --- a/include/uapi/linux/perf_event.h > +++ b/include/uapi/linux/perf_event.h > @@ -340,7 +340,8 @@ struct perf_event_attr { > comm_exec : 1, /* flag comm events that are due to an exec */ > use_clockid : 1, /* use @clockid for time fields */ > context_switch : 1, /* context switch data */ > - __reserved_1 : 37; > + mmap3 : 1, /* include mmap with mtime */ > + __reserved_1 : 36; > > union { > __u32 wakeup_events; /* wakeup every n events */ > @@ -856,6 +857,29 @@ enum perf_event_type { > */ > PERF_RECORD_SWITCH_CPU_WIDE = 15, > > + /* > + * The MMAP3 records are an augmented version of MMAP2, they add > + * mtime. > + * > + * struct { > + * struct perf_event_header header; > + * > + * u32 pid, tid; > + * u64 addr; > + * u64 len; > + * u64 pgoff; > + * u32 maj; > + * u32 min; > + * u64 ino; > + * u64 ino_generation; > + * u32 prot, flags; > + * u64 mtime; > + * char filename[]; > + * struct sample_id sample_id; > + * } > + */ > + PERF_RECORD_MMAP3 = 16, > + > PERF_RECORD_MAX, /* non-ABI */ > }; > > diff --git a/kernel/events/core.c b/kernel/events/core.c > index bf8244190d0f..d400da14b923 100644 > --- a/kernel/events/core.c > +++ b/kernel/events/core.c > @@ -5661,7 +5661,7 @@ perf_event_aux(perf_event_aux_output_cb output, void *data, > /* > * task tracking -- fork/exit > * > - * enabled by: attr.comm | attr.mmap | attr.mmap2 | attr.mmap_data | attr.task > + * enabled by: attr.comm | attr.mmap | attr.mmap2 | attr.mmap3 | attr.mmap_data | attr.task > */ > > struct perf_task_event { > @@ -5682,8 +5682,8 @@ struct perf_task_event { > static int perf_event_task_match(struct perf_event *event) > { > return event->attr.comm || event->attr.mmap || > - event->attr.mmap2 || event->attr.mmap_data || > - event->attr.task; > + event->attr.mmap2 || event->attr.mmap3 || > + event->attr.mmap_data || event->attr.task; > } > > static void perf_event_task_output(struct perf_event *event, > @@ -5872,6 +5872,7 @@ struct perf_mmap_event { > u64 ino; > u64 ino_generation; > u32 prot, flags; > + u64 mtime; > > struct { > struct perf_event_header header; > @@ -5892,7 +5893,7 @@ static int perf_event_mmap_match(struct perf_event *event, > int executable = vma->vm_flags & VM_EXEC; > > return (!executable && event->attr.mmap_data) || > - (executable && (event->attr.mmap || event->attr.mmap2)); > + (executable && (event->attr.mmap || event->attr.mmap2 || event-attr.mmap3)); > } > > static void perf_event_mmap_output(struct perf_event *event, > @@ -5916,6 +5917,9 @@ static void perf_event_mmap_output(struct perf_event *event, > mmap_event->event_id.header.size += sizeof(mmap_event->prot); > mmap_event->event_id.header.size += sizeof(mmap_event->flags); > } > + if (event->attr.mmap3) { > + mmap_event->event_id.header.size += sizeof(mmap_event->mtime); > + } > > perf_event_header__init_id(&mmap_event->event_id.header, &sample, event); > ret = perf_output_begin(&handle, event, > @@ -5936,6 +5940,9 @@ static void perf_event_mmap_output(struct perf_event *event, > perf_output_put(&handle, mmap_event->prot); > perf_output_put(&handle, mmap_event->flags); > } > + if (event->attr.mmap3) { > + perf_output_put(&handle, mmap_event->mtime); > + } > > __output_copy(&handle, mmap_event->file_name, > mmap_event->file_size); > @@ -5954,6 +5961,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event) > int maj = 0, min = 0; > u64 ino = 0, gen = 0; > u32 prot = 0, flags = 0; > + u64 mtime = 0; > unsigned int size; > char tmp[16]; > char *buf = NULL; > @@ -5984,6 +5992,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event) > gen = inode->i_generation; > maj = MAJOR(dev); > min = MINOR(dev); > + mtime = timespec_to_ns(inode->i_mtime); > > if (vma->vm_flags & VM_READ) > prot |= PROT_READ; > @@ -6054,6 +6063,7 @@ static void perf_event_mmap_event(struct perf_mmap_event *mmap_event) > mmap_event->ino_generation = gen; > mmap_event->prot = prot; > mmap_event->flags = flags; > + mmap_event->mtime = mtime; > > if (!(vma->vm_flags & VM_EXEC)) > mmap_event->event_id.header.misc |= PERF_RECORD_MISC_MMAP_DATA; > @@ -6096,6 +6106,7 @@ void perf_event_mmap(struct vm_area_struct *vma) > /* .ino_generation (attr_mmap2 only) */ > /* .prot (attr_mmap2 only) */ > /* .flags (attr_mmap2 only) */ > + /* .mtime (attr_mmap3 only) */ > }; > > perf_event_mmap_event(&mmap_event);