From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1751138AbdFTRK1 (ORCPT ); Tue, 20 Jun 2017 13:10:27 -0400
Received: from mga14.intel.com ([192.55.52.115]:3437 "EHLO mga14.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750823AbdFTRK0 (ORCPT ); Tue, 20 Jun 2017 13:10:26 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.39,364,1493708400"; d="scan'208";a="99817997"
Subject: Re: [PATCH v3 1/n] perf/core: addressing 4x slowdown during per-process profiling of STREAM benchmark on Intel Xeon Phi
To: Mark Rutland
Cc: Peter Zijlstra, Ingo Molnar, Arnaldo Carvalho de Melo,
	Alexander Shishkin, Andi Kleen, Kan Liang, Dmitri Prokhorov,
	Valery Cherepennikov, David Carrillo-Cisneros, Stephane Eranian,
	linux-kernel@vger.kernel.org
References: <09226446-39b9-9bd2-d60f-b9bb947987c5@linux.intel.com>
	<20170615195618.GA8807@leverpostej>
	<162b1822-06b7-30e0-073c-911ba99b33b7@linux.intel.com>
	<20170620133624.GF28157@leverpostej>
	<20170620163728.GC26710@leverpostej>
From: Alexey Budankov
Organization: Intel Corp.
Message-ID: <4611984d-0d77-1584-3011-768c80c261af@linux.intel.com>
Date: Tue, 20 Jun 2017 20:10:06 +0300
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1
MIME-Version: 1.0
In-Reply-To: <20170620163728.GC26710@leverpostej>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 20.06.2017 19:37, Mark Rutland wrote:
> On Tue, Jun 20, 2017 at 06:22:56PM +0300, Alexey Budankov wrote:
>> On 20.06.2017 16:36, Mark Rutland wrote:
>>> On Mon, Jun 19, 2017 at 11:31:59PM +0300, Alexey Budankov wrote:
>>>> On 15.06.2017 22:56, Mark Rutland wrote:
>>>>> On Thu, Jun 15, 2017 at 08:41:42PM +0300, Alexey Budankov wrote:
>>>>>> +static int
>>>>>> +perf_cpu_tree_iterate(struct rb_root *tree,
>>>>>> +		perf_cpu_tree_callback_t callback, void *data)
>>>>>> +{
>>>>>> +	int ret = 0;
>>>>>> +	struct rb_node *node;
>>>>>> +	struct perf_event *event;
>>>>>> +
>>>>>> +	WARN_ON_ONCE(!tree);
>>>>>> +
>>>>>> +	for (node = rb_first(tree); node; node = rb_next(node)) {
>>>>>> +		struct perf_event *node_event = container_of(node,
>>>>>> +				struct perf_event, group_node);
>>>>>> +
>>>>>> +		list_for_each_entry(event, &node_event->group_list,
>>>>>> +				group_list_entry) {
>>>>>> +			ret = callback(event, data);
>>>>>> +			if (ret)
>>>>>> +				return ret;
>>>>>> +		}
>>>>>> +	}
>>>>>> +
>>>>>> +	return 0;
>>>>>>  }
>>>>>
>>>>> If you need to iterate over every event, you can use the list that
>>>>> threads the whole tree.
>>>>
>>>> Could you please explain more on that?
>>>
>>> In Peter's original suggestion, we'd use a threaded tree rather than a
>>> tree of lists.
>>>
>>> i.e. you'd have something like:
>>>
>>> struct threaded_rb_node {
>>> 	struct rb_node node;
>>> 	struct list_head head;
>>> };
>>
>> Is this for every group leader?
>
> Yes; *every* group leader would be directly in the threaded rb tree.
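
To make the threaded layout described above concrete, here is a minimal
userspace sketch. It is not kernel code: struct leader, struct leader_tree,
leader_insert() and the seq tie-breaker are all hypothetical names, and a
plain unbalanced binary search tree stands in for the kernel rb tree to keep
it short. It only shows how every leader can sit directly in the tree while
a list threads all of them in key order.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a group leader; not struct perf_event. */
struct leader {
	int cpu;                     /* primary key */
	unsigned long seq;           /* assumed tie-breaker so keys are unique */
	struct leader *left, *right; /* tree links (unbalanced BST for brevity) */
	struct leader *prev, *next;  /* thread: in-order doubly linked list */
};

struct leader_tree {
	struct leader *root;
	struct leader *first;        /* leftmost node, head of the thread */
};

/* Order by cpu first, then by seq. */
static int leader_less(const struct leader *a, const struct leader *b)
{
	if (a->cpu != b->cpu)
		return a->cpu < b->cpu;
	return a->seq < b->seq;
}

static void leader_insert(struct leader_tree *t, struct leader *n)
{
	struct leader *parent = NULL, **link = &t->root;
	int went_left = 0;

	n->left = n->right = n->prev = n->next = NULL;

	/* Ordinary BST descent to find the empty leaf slot. */
	while (*link) {
		parent = *link;
		went_left = leader_less(n, parent);
		link = went_left ? &parent->left : &parent->right;
	}
	*link = n;

	if (!parent) {
		t->first = n;
	} else if (went_left) {
		/* A new left leaf is the in-order predecessor of its parent. */
		n->next = parent;
		n->prev = parent->prev;
		if (parent->prev)
			parent->prev->next = n;
		else
			t->first = n;
		parent->prev = n;
	} else {
		/* A new right leaf is the in-order successor of its parent. */
		n->prev = parent;
		n->next = parent->next;
		if (parent->next)
			parent->next->prev = n;
		parent->next = n;
	}
}

/* Visit every leader by walking the thread; no tree descent is needed. */
static int leader_count_cpu(const struct leader_tree *t, int cpu)
{
	const struct leader *l;
	int n = 0;

	for (l = t->first; l; l = l->next)
		if (l->cpu == cpu)
			n++;
	return n;
}
```

With this layout a full iteration is a plain list walk from t->first, and
the tree is only needed to find where a given key starts.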

In this case the tree's key needs to be something trickier than just
event->cpu. To avoid that complication, group_list is introduced.

BTW, addressing the perf_event_tree_delete issue doesn't look like a big
change now:

static void
perf_cpu_tree_delete(struct rb_root *tree, struct perf_event *event)
{
	struct perf_event *next;

	WARN_ON_ONCE(!tree || !event);

	list_del_init(&event->group_entry);

	if (!RB_EMPTY_NODE(&event->group_node)) {
		if (!list_empty(&event->group_list)) {
			/* other events remain: the next one takes over the node */
			next = list_first_entry(&event->group_list,
					struct perf_event, group_entry);
			list_replace_init(&event->group_list,
					&next->group_list);
			rb_replace_node(&event->group_node,
					&next->group_node, tree);
		} else {
			/* last event for this key: drop the node itself */
			rb_erase(&event->group_node, tree);
		}
		RB_CLEAR_NODE(&event->group_node);
	}
}

>
>> Which objects does the head keep?
>
> Sorry, I'm not sure how to answer that. Did the above clarify?
>
> If not, could you rephrase the question?
>
> Thanks,
> Mark.
>
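
The hand-over step in perf_cpu_tree_delete() above can be modelled in
userspace roughly as follows. This is only a sketch under loud assumptions:
struct event, event_insert(), event_delete() and NR_CPUS are hypothetical,
an array indexed by CPU stands in for the rb tree, the list helpers are
simplified re-implementations rather than the kernel's list.h, and the node
event is assumed to be linked on its own list.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4 /* hypothetical bound; cpu must be in 0..NR_CPUS-1 here */

struct list_head { struct list_head *prev, *next; };

static void list_init(struct list_head *h) { h->prev = h->next = h; }
static int list_empty(const struct list_head *h) { return h->next == h; }

static void list_add_tail(struct list_head *n, struct list_head *h)
{
	n->prev = h->prev;
	n->next = h;
	h->prev->next = n;
	h->prev = n;
}

static void list_del_init(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	list_init(n);
}

/* Move a non-empty list from head 'old' to head 'repl'; 'old' ends up empty. */
static void list_replace_init(struct list_head *old, struct list_head *repl)
{
	repl->next = old->next;
	repl->next->prev = repl;
	repl->prev = old->prev;
	repl->prev->next = repl;
	list_init(old);
}

struct event {
	int cpu;
	struct list_head entry; /* link on the per-CPU list */
	struct list_head list;  /* list head, used only while this is the node */
};

#define entry_to_event(p) \
	((struct event *)((char *)(p) - offsetof(struct event, entry)))

/* Stand-in for the rb tree: at most one "node" event per CPU. */
static struct event *node[NR_CPUS];

static void event_insert(struct event *e)
{
	list_init(&e->entry);
	list_init(&e->list);
	if (!node[e->cpu])
		node[e->cpu] = e; /* first event for this CPU becomes the node */
	list_add_tail(&e->entry, &node[e->cpu]->list);
}

static void event_delete(struct event *e)
{
	list_del_init(&e->entry);
	if (node[e->cpu] != e)
		return; /* not the node: unlinking was all that was needed */

	if (!list_empty(&e->list)) {
		/* Hand the list and the tree slot over to the next event. */
		struct event *next = entry_to_event(e->list.next);

		list_replace_init(&e->list, &next->list);
		node[e->cpu] = next;
	} else {
		node[e->cpu] = NULL; /* last event for this CPU */
	}
}
```

The array keeps the example short; in the tree version the same hand-over is
the rb_replace_node() call, and the empty case is the rb_erase() call.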