Message-ID: <5123C3AF.8060100@linaro.org>
Date: Tue, 19 Feb 2013 10:25:51 -0800
From: John Stultz
To: Thomas Gleixner
CC: Stephane Eranian, Pawel Moll, Peter Zijlstra, LKML, "mingo@elte.hu",
 Paul Mackerras, Anton Blanchard, Will Deacon, "ak@linux.intel.com",
 Pekka Enberg, Steven Rostedt, Robert Richter
Subject: Re: [RFC] perf: need to expose sched_clock to correlate user samples with kernel samples
References: <1350408232.2336.42.camel@laptop>
 <1359728280.8360.15.camel@hornet>
 <51118797.9080800@linaro.org>

On 02/18/2013 12:35 PM, Thomas Gleixner wrote:
> On Tue, 5 Feb 2013, John Stultz wrote:
>> On 02/05/2013 02:13 PM, Stephane Eranian wrote:
>>> But if people are strongly opposed to the clock_gettime() approach, then
>>> I can go with the ioctl() because the functionality is definitively needed
>>> ASAP.
>> I prefer the ioctl method, since its less likely to be re-purposed/misused.
> Urgh. No! With a dedicated CLOCK_PERF we might have a decent chance to
> put this into a vsyscall. With an ioctl not so much.
>
>> Though I'd be most comfortable with finding some way for perf-timestamps to be
>> CLOCK_MONOTONIC based (or maybe CLOCK_MONOTONIC_RAW if it would be easier),
>> and just avoid all together adding another time domain that doesn't really
>> have clear definition (other then "what perf uses").
> What's wrong with that. We already have the infrastructure to create
> dynamic time domains which can be completely disconnected from
> everything else.

Right, but those are for actual hardware domains that we had no other
way of interacting with.

> Tracing/perf/instrumentation is a different domain and the main issue
> there is performance. So going for a vsyscall enabled clock_gettime()
> approach is definitely the best thing to do.

So describe how the perf time domain is different than CLOCK_MONOTONIC_RAW.

My concern here is that we're basically creating a kernel interface that
exports implementation-defined semantics (again: whatever perf does
right now). And I think folks want to do this because adding CLOCK_PERF
is easier than trying to:

1) Get a lock-free method for accessing CLOCK_MONOTONIC_RAW

2) Have perf interpolate its timestamps to CLOCK_MONOTONIC or
CLOCK_MONOTONIC_RAW when it exports the data

The semantics of sched_clock() have been very flexible and hand-wavy in
the past. And I agree with the need for the kernel to have a
"fast-and-loose" clock, as well as with the benefits of that flexibility
as the scheduler code has evolved. But nonetheless, the changes in its
semantics have bitten us badly a few times.

So I totally understand why the vsyscall is attractive. I'm just very
cautious about exporting a similarly fuzzily defined interface to
userland. So until it's clear what the semantics will need to be going
forward (forever!), my preference will be that we not add it.

thanks
-john
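
To make option 2 above a bit more concrete, here is a rough sketch of what interpolating perf timestamps onto CLOCK_MONOTONIC_RAW could look like at export time. It is only an illustration under assumptions: the function name and the list of sync points (pairs of a perf-domain timestamp and a reference-clock timestamp taken at the same moment, both in nanoseconds) are hypothetical, and nothing in this thread says perf records such pairs today. It also assumes the perf timestamps in the sync points are strictly increasing.

```python
from bisect import bisect_right


def interpolate_to_reference(perf_ts, sync_points):
    """Map a perf-domain timestamp (ns) onto a reference clock (ns).

    sync_points is a hypothetical list of (perf_ns, reference_ns) pairs,
    sorted by perf_ns and strictly increasing in perf_ns. Each pair is
    assumed to have been sampled at (nearly) the same instant.
    """
    if len(sync_points) < 2:
        raise ValueError("need at least two sync points to interpolate")

    perf_times = [p for p, _ in sync_points]

    # Find the segment that brackets perf_ts. Timestamps outside the
    # recorded range are extrapolated from the first or last segment.
    i = bisect_right(perf_times, perf_ts)
    i = min(max(i, 1), len(sync_points) - 1)

    (p0, r0), (p1, r1) = sync_points[i - 1], sync_points[i]

    # Linear interpolation in integer nanoseconds. The ratio
    # (r1 - r0) / (p1 - p0) absorbs any rate difference between the two
    # clocks over this segment.
    return r0 + (perf_ts - p0) * (r1 - r0) // (p1 - p0)
```

With sync points such as [(1000, 5000), (2000, 6010), (3000, 7010)], a sample stamped 1500 in the perf domain falls in the first segment and is scaled by that segment's rate. The accuracy depends entirely on how often sync points are taken and how closely the two reads in each pair are spaced, which is part of why this is harder than just adding a new clock id.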