From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753065AbbDMHG2 (ORCPT ); Mon, 13 Apr 2015 03:06:28 -0400 Received: from mga01.intel.com ([192.55.52.88]:44116 "EHLO mga01.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752174AbbDMHGZ (ORCPT ); Mon, 13 Apr 2015 03:06:25 -0400 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.11,568,1422950400"; d="scan'208";a="707951751" Message-ID: <552B6A51.3040106@intel.com> Date: Mon, 13 Apr 2015 10:03:45 +0300 From: Adrian Hunter Organization: Intel Finland Oy, Registered Address: PL 281, 00181 Helsinki, Business Identity Code: 0357606 - 4, Domiciled in Helsinki User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.5.0 MIME-Version: 1.0 To: Stephane Eranian CC: LKML , Arnaldo Carvalho de Melo , Peter Zijlstra , "mingo@elte.hu" , "ak@linux.intel.com" , Jiri Olsa , Namhyung Kim , Rose Belcher , Sukadev Bhattiprolu , Sonny Rao , John Mccutchan , David Ahern , Pawel Moll Subject: Re: [PATCH v6 3/4] perf inject: add jitdump mmap injection support References: <1427753974-13380-1-git-send-email-eranian@google.com> <1427753974-13380-4-git-send-email-eranian@google.com> <551B9731.7080505@intel.com> <55251BD9.30700@intel.com> <5527C742.3080903@intel.com> In-Reply-To: Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 13/04/15 03:37, Stephane Eranian wrote: > On Fri, Apr 10, 2015 at 5:51 AM, Adrian Hunter wrote: >> On 08/04/15 17:12, Stephane Eranian wrote: >>> On Wed, Apr 8, 2015 at 5:15 AM, Adrian Hunter wrote: >>>> On 06/04/15 22:41, Stephane Eranian wrote: >>>>> > + if (inject.build_ids) { >>>>> > + /* >>>>> > + * to make sure the mmap records are ordered correctly >>>>> > + * and so that the correct especially due to jitted code >>>>> > + * mmaps. We cannot generate the buildid hit list and >>>>> > + * inject the jit mmaps at the same time for now. >>>>> > + */ >>>>> > + inject.tool.ordered_events = true; >>>>> > + inject.tool.ordering_requires_timestamps = true; >>>>> > + } >>>>> > + >>>>> > + if (inject.jit_mode) { >>>>> > + inject.tool.mmap2 = perf_event__repipe_mmap2; >>>>> > + inject.tool.mmap = perf_event__repipe_mmap; >>>>> >>>>> As suggested above, why not make your own tool fns e.g. >>>>> >>>>> inject.tool.mmap2 = perf_event__jit_mode_mmap2; >>>>> inject.tool.mmap = perf_event__jit_mode_mmap; >>>>> >>>>> >>>>> > + inject.tool.ordered_events = true; >>>>> > + inject.tool.ordering_requires_timestamps = true; >>>>> >>>>> You are taking advantage of a bug in perf-inject, that is the >>>>> "finished_round" events are not being processed. Really they should be >>>>> processed and you should inject in time order. >>>>> >>>>> I am not sure I understand. >>>>> Yes, I am trying to have inject reorder the samples instead of perf report. >>>>> You are likely to run inject once, and report many times. Also avoids a >>>>> warning in report about out-of-order events. >>>> >>>> Well forgetting about "finished_round", it seems to me you need to intercept >>>> all the delivered events (which will be in time order) and inject your own >>>> events at the right time. >>>> >>>> At the moment it seems to me you are injecting all your events in one go >>>> when you see the special jitdump mmap. So I would not expect the injected >>>> events to be ordered with respect to other events that come later. But >>>> maybe I misunderstand that? >>>> >>> My understanding is that if I set ordered_events = true, then events >>> are not delivered immediately to the callbacks. They are queued and >>> sorted and then passed to the callbacks. And yes, there is the finished >>> round mechanism of which I don't quite fully understand the logic in this >>> case. >>> >>>> As I confusingly tried to suggest earlier, one way to see all the >>>> delivered events is to hook the ordered_events "deliver" callback. That >>>> will mean injecting one mmap event at a time. >>>> >>>> Here is just an idea. >>>> >>>> struct perf_inject { >>>> ... >>>> ordered_events__deliver_t deliver; >>>> }; >>>> >>>> int cmd_inject(int argc, const char **argv, const char *prefix __maybe_unused) >>>> { >>>> ... >>>> inject.deliver = inject.session->ordered_events.deliver; >>>> inject.session->ordered_events.deliver = inject_jit_mmap; >>>> ... >>>> } >>>> >>> ok on that. >>> >>>> int inject_jit_mmap(struct ordered_events *oe, struct ordered_event *event) >>>> { >>>> struct perf_session *session = container_of(oe, struct perf_session, ordered_events); >>>> struct perf_inject *inject = container_of(session->tool, struct perf_inject, tool); >>>> >>>> /* Is it time to inject an event */ >>>> if (jit_next_timestamp(inject) < event->timestamp) { >>>> /* Yes, so inject it by delivery */ >>>> perf_session__deliver_synth_event(...); >>>> } >>>> inject->deliver(oe, event); >>>> } >>>> >>> That suggests I have buffered all the MMAPs synthesized from the jitdump and >>> I have some sorted queue based on jitdump timestamps. >> >> Yes, but won't the jitdump timestamps already be in order? >> > I looked at implementing your approach and it turns out to be much > more complicated > than what is already there. > I do not inject anything if I don't have to. I am waiting to see a > special MMAP of the > jitdump file to start inject jit mmaps. That could not change in your > scheme. I would > need to track MMAP records and check if they correspond to a jitdump. Note that > I may have to inject multiple jitdumps in one run of perf inject. In > that case, in the > deliver function, I would have to (1) either have all the jitdump MMAPs prepared > in order for all the jitdumps I have detected or (2) go through all > the jitdump current > head records to see which one matches. In (2) I would have to do this > not just when > I see an MMAP but for any event delivered: for each new event, compare its > timestamp to the timestamps of jitdump current record for ALL jitdump > files. For (1), > I need to build a global list of all jitdump records in for each > jitdump file, i.e, another > RB tree. And then pull them one by one out if I find an event with the > right timestamp. > > I think (2) i more reasonable but requires quite some infrastructure > work in the jitdump > code. Most of the pieces are there, the RB tree needs to be added > obviously. I will > experiment with (2). I do something like (2) for Intel PT, but I use a heap to hold the next lowest timestamp for each "queue" of data that needs to be processed - because I have to allow for the possibility of a large number of "queues". > > >>> The test would have to >>> be more sophisticated that this. You'd want to insert at the right >>> time, i.e., you'd >>> have to track the previous and next timestamp in the stream: >>> >>> retry: >>> j = jit_next_timestamp(inject->jit); >>> if (prev_event->timestamp <= j && j > next_event->timestamp) { >> >> (prev_event->timestamp <= j) must be always true because otherwise you would >> have injected the jit mmap event already. >> >> So perf is about to deliver an event with timestamp 'event->timestamp', so >> you just need to deliver all your events that have not already been >> delivered but have a timestamp < event->timestamp >> >> >>> deliver_synth_event(inject->jit); >>> jit_next(inject->jit); >>> goto retry; >>> } >>> >>> All of this is required because the finished round logic is such that perf >>> cannot sort all records at once. >> >> It can sort them all at once, but finished_round is an optimization that >> starts delivering "early" events before having to sort the "later" events. >> The result is still that the delivered events are in sorted order. >> >>> It does sort them by chunks. Thus I need >>> to make sure I inject in the right chunk otherwise it won't work. Am I reading >>> this right? >> >> If you were to queue the events for sorting, yes. What I suggested was >> delivering your events directly at the "right time". >> >>> >>>> You would also need to inject any remaining events right at the end. >>>> >>>> Note the advantage of "delivering" events is that the same approach works whatever the tool. >>>> >>>> >>> >>> >> > >