From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1752819Ab2JRTdg (ORCPT ); Thu, 18 Oct 2012 15:33:36 -0400
Received: from mail-la0-f46.google.com ([209.85.215.46]:57684
        "EHLO mail-la0-f46.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S1751297Ab2JRTdf (ORCPT );
        Thu, 18 Oct 2012 15:33:35 -0400
MIME-Version: 1.0
In-Reply-To: <1350408232.2336.42.camel@laptop>
References: <1350408232.2336.42.camel@laptop>
Date: Thu, 18 Oct 2012 21:33:33 +0200
Message-ID:
Subject: Re: [RFC] perf: need to expose sched_clock to correlate user samples with kernel samples
From: Stephane Eranian
To: Peter Zijlstra
Cc: LKML, "mingo@elte.hu", Paul Mackerras, Anton Blanchard, Will Deacon,
        "ak@linux.intel.com", Pekka Enberg, Steven Rostedt, Robert Richter,
        tglx, John Stultz
Content-Type: text/plain; charset=UTF-8
X-System-Of-Record: true
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Oct 16, 2012 at 7:23 PM, Peter Zijlstra wrote:
> On Tue, 2012-10-16 at 12:13 +0200, Stephane Eranian wrote:
>> Hi,
>>
>> There are many situations where we want to correlate events happening at
>> the user level with samples recorded in the perf_event kernel sampling buffer.
>> For instance, we might want to correlate the call to a function or creation of
>> a file with samples. Similarly, when we want to monitor a JVM with jitted code,
>> we need to be able to correlate jitted code mappings with perf event samples
>> for symbolization.
>>
>> Perf_events allows timestamping of samples with PERF_SAMPLE_TIME.
>> That causes each PERF_RECORD_SAMPLE to include a timestamp
>> generated by calling the local_clock() -> sched_clock_cpu() function.
>>
>> To make correlating user vs. kernel samples easy, we would need to
>> access that sched_clock() functionality. However, none of the existing
>> clock calls permit this at this point. They all return timestamps which are
>> not using the same source and/or offset as sched_clock.
>>
>> I believe a similar issue exists with the ftrace subsystem.
>>
>> The problem needs to be addressed in a portable manner. Solutions
>> based on reading TSC for the user level to reconstruct sched_clock()
>> don't seem appropriate to me.
>>
>> One possibility to address this limitation would be to extend clock_gettime()
>> with a new clock time, e.g., CLOCK_PERF.
>>
>> However, I understand that sched_clock_cpu() provides ordering guarantees only
>> when invoked on the same CPU repeatedly, i.e., it's not globally synchronized.
>> But we already have to deal with this problem when merging samples obtained
>> from different CPU sampling buffers in per-thread mode. So this is not
>> necessarily a showstopper.
>>
>> Alternatives could be to use uprobes but that's less practical to set up.
>>
>> Anyone with better ideas?
>
> You forgot to CC the time people ;-)
>
I did not know where they were.

> I've no problem with adding CLOCK_PERF (or another/better name).
>
Ok, good.

> Thomas, John?
>
Any comment?
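
To make the proposal concrete, here is a minimal sketch of how a user-level
tool might stamp its own events and then merge them with kernel samples by
time. It rests on several assumptions: CLOCK_PERF is only a name suggested in
this thread and does not exist, so CLOCK_MONOTONIC stands in purely so the
code compiles (its values are not on the same time base as PERF_SAMPLE_TIME).
The struct stamped type and the helpers user_event_stamp() and
merge_by_time() are hypothetical names made up for this illustration.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L   /* for clock_gettime() under strict -std */
#endif
#include <stddef.h>
#include <stdint.h>
#include <time.h>

/*
 * ASSUMPTION: CLOCK_PERF is only proposed in this thread and does not exist.
 * CLOCK_MONOTONIC is a stand-in so the sketch builds; it does NOT share a
 * source or offset with the timestamps perf puts in PERF_RECORD_SAMPLE.
 */
#define USER_EVENT_CLOCK CLOCK_MONOTONIC

/* Hypothetical record: a timestamp plus a tag saying where it came from. */
struct stamped {
    uint64_t time_ns;
    int from_user;   /* 1 = user-level event, 0 = kernel sample */
};

/* Timestamp a user-level event (e.g. a JIT code mapping) in nanoseconds. */
int user_event_stamp(struct stamped *ev)
{
    struct timespec ts;

    if (clock_gettime(USER_EVENT_CLOCK, &ts) != 0)
        return -1;
    ev->time_ns = (uint64_t)ts.tv_sec * 1000000000ULL + (uint64_t)ts.tv_nsec;
    ev->from_user = 1;
    return 0;
}

/*
 * Merge two streams, each already sorted by time, into one ordered stream.
 * out must have room for nu + nk entries. On equal timestamps the user
 * event is emitted first.
 */
size_t merge_by_time(const struct stamped *user, size_t nu,
                     const struct stamped *kern, size_t nk,
                     struct stamped *out)
{
    size_t i = 0, j = 0, n = 0;

    while (i < nu && j < nk)
        out[n++] = (user[i].time_ns <= kern[j].time_ns) ? user[i++] : kern[j++];
    while (i < nu)
        out[n++] = user[i++];
    while (j < nk)
        out[n++] = kern[j++];
    return n;
}
```

The merge is only meaningful if both streams use the same clock, which is
exactly what is missing today. Since sched_clock_cpu() is not globally
synchronized, ordering across CPUs would remain approximate even then, as it
already is when merging per-CPU sampling buffers.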