From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1750988AbdE2K4P (ORCPT ); Mon, 29 May 2017 06:56:15 -0400
Received: from mga06.intel.com ([134.134.136.31]:56195 "EHLO mga06.intel.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750893AbdE2K4O (ORCPT ); Mon, 29 May 2017 06:56:14 -0400
X-ExtLoop1: 1
X-IronPort-AV: E=Sophos;i="5.38,414,1491289200"; d="scan'208";a="1135846241"
Subject: Re: [PATCH v2]: perf/core: addressing 4x slowdown during per-process,
	profiling of STREAM benchmark on Intel Xeon Phi
To: Peter Zijlstra
Cc: Ingo Molnar, Arnaldo Carvalho de Melo, Alexander Shishkin, Andi Kleen,
	Kan Liang, Dmitri Prokhorov, Valery Cherepennikov,
	David Carrillo-Cisneros, Stephane Eranian, Mark Rutland,
	linux-kernel@vger.kernel.org
References: <1e962b59-3e39-e0d6-515d-c4fd3502edae@linux.intel.com>
	<20170529074636.tjftcdtcg6op74i3@hirez.programming.kicks-ass.net>
	<75f031d8-68ec-4cd6-752f-1fbecaa86026@linux.intel.com>
	<20170529104304.vy47zhf6fdq6bki3@hirez.programming.kicks-ass.net>
From: Alexey Budankov
Organization: Intel Corp.
Message-ID: <0e8d266e-ea38-baea-765d-cab98df9b9bc@linux.intel.com>
Date: Mon, 29 May 2017 13:56:05 +0300
User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.1.1
MIME-Version: 1.0
In-Reply-To: <20170529104304.vy47zhf6fdq6bki3@hirez.programming.kicks-ass.net>
Content-Type: text/plain; charset=utf-8; format=flowed
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 29.05.2017 13:43, Peter Zijlstra wrote:
> On Mon, May 29, 2017 at 12:15:14PM +0300, Alexey Budankov wrote:
>> On 29.05.2017 10:46, Peter Zijlstra wrote:
>>> On Sat, May 27, 2017 at 02:19:51PM +0300, Alexey Budankov wrote:
>
>>>> @@ -742,7 +772,17 @@ struct perf_event_context {
>>>>
>>>>  	struct list_head active_ctx_list;
>>>>  	struct list_head pinned_groups;
>>>> +	/*
>>>> +	 * Cpu tree for pinned groups; keeps event's group_node nodes
>>>> +	 * of attached flexible groups;
>>>> +	 */
>>>> +	struct rb_root pinned_tree;
>>>>  	struct list_head flexible_groups;
>>>> +	/*
>>>> +	 * Cpu tree for flexible groups; keeps event's group_node nodes
>>>> +	 * of attached flexible groups;
>>>> +	 */
>>>> +	struct rb_root flexible_tree;
>>>>  	struct list_head event_list;
>>>>  	int nr_events;
>>>>  	int nr_active;
>>>> @@ -758,6 +798,7 @@ struct perf_event_context {
>>>>  	 */
>>>>  	u64 time;
>>>>  	u64 timestamp;
>>>> +	struct perf_event_tstamp tstamp_data;
>>>>
>>>>  	/*
>>>>  	 * These fields let us detect when two contexts have both
>>>
>>>
>>> So why do we now have a list _and_ a tree for the same entries?
>
>> We need groups list to iterate through all groups configured for collection
>> and we need the tree to quickly iterate through the groups allocated for a
>> particular CPU only.
>
> *confused*, what?
>
> Why can't the tree do both?
>

Well, indeed, the tree provides such capability too.
However, switching to full tree iteration in the places where we now walk
the _groups lists will enlarge the patch, which is probably not a big deal.
Do you think it is worth implementing the switch?

Thanks,
Alexey
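
For illustration only, here is a minimal sketch of why a single tree keyed
by CPU can serve both purposes discussed above: a full in-order walk visits
every group, and a keyed descent visits only the groups of one CPU. This is
not the patch's code and does not use the kernel rb-tree API; struct group,
tree_insert(), tree_for_each(), tree_for_each_cpu() and count_group() are
all hypothetical names, and the tree is a plain unbalanced binary search
tree to keep the example short.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an event group; not struct perf_event. */
struct group {
	int cpu;			/* key: CPU the group is bound to */
	int id;				/* payload, unused by the walkers */
	struct group *left, *right;
};

typedef void (*group_fn)(struct group *g, void *data);

/* Insert into an unbalanced BST keyed by cpu; equal keys go right. */
static void tree_insert(struct group **root, struct group *g)
{
	assert(root && g);
	g->left = g->right = NULL;
	while (*root)
		root = (g->cpu < (*root)->cpu) ? &(*root)->left
					       : &(*root)->right;
	*root = g;
}

/* Full iteration: in-order walk over every group, like a list walk. */
static void tree_for_each(struct group *n, group_fn fn, void *data)
{
	if (!n)
		return;
	tree_for_each(n->left, fn, data);
	fn(n, data);
	tree_for_each(n->right, fn, data);
}

/*
 * Per-CPU iteration: follow the same path an insert with this key
 * would take, so only groups with a matching cpu are visited.
 */
static void tree_for_each_cpu(struct group *n, int cpu, group_fn fn,
			      void *data)
{
	while (n) {
		if (cpu < n->cpu) {
			n = n->left;
		} else {
			if (cpu == n->cpu)
				fn(n, data);
			n = n->right;	/* equal keys were inserted right */
		}
	}
}

/* Example callback: count the visited groups in *data (an int). */
static void count_group(struct group *g, void *data)
{
	(void)g;
	++*(int *)data;
}
```

With groups inserted for CPUs 0, 1, 1, 2 and -1, calling tree_for_each()
with count_group() counts all five groups, while tree_for_each_cpu() for
CPU 1 counts only two. A balanced tree would be needed to bound the cost
of the per-CPU descent.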