Date: Mon, 4 Jul 2016 19:05:35 +0100
From: Mark Rutland
To: Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Ingo Molnar, Will Deacon
Subject: Re: [PATCH] perf: fix pmu::filter_match for SW-led groups
Message-ID: <20160704180534.GD9048@leverpostej>
References: <1465917041-15339-1-git-send-email-mark.rutland@arm.com>
	<20160702164025.GU30921@twins.programming.kicks-ass.net>
In-Reply-To: <20160702164025.GU30921@twins.programming.kicks-ass.net>

On Sat, Jul 02, 2016 at 06:40:25PM +0200, Peter Zijlstra wrote:
> On Tue, Jun 14, 2016 at 04:10:41PM +0100, Mark Rutland wrote:
> > However, pmu::filter_match is only called for the leader of each event
> > group. When the leader is a SW event, we do not filter the groups, and
> > may fail at pmu::add time, and when this happens we'll give up on
> > scheduling any event groups later in the list until they are rotated
> > ahead of the failing group.
>
> Ha! indeed.
>
> > I've tried to find a better way of handling this (without needing to
> > walk the siblings list), but so far I'm at a loss. At least it's "only"
> > O(n) in the size of the sibling list we were going to walk anyway.
> >
> > I suspect that at a more fundamental level, I need to stop sharing a
> > perf_hw_context between HW PMUs (i.e. replace
> > task_struct::perf_event_ctxp with something that can handle multiple
> > HW PMUs). From previous attempts I'm not sure if that's going to be
> > possible.
> >
> > Any ideas appreciated!
>
> So I think I have half-cooked ideas.
>
> One of the problems I've been wanting to solve for a long time is that
> the per-cpu flexible list has priority over the per-task flexible list.
>
> I would like them to rotate together.

Makes sense.

> One of the ways I was looking at getting that done is a virtual runtime
> scheduler (just like cfs). The tricky point is merging two virtual
> runtime trees. But I think that should be doable if we sort the trees
> on lag.
>
> In any case, the relevance to your question is that once we have a
> tree, we can play games with order; that is, if we first order on
> PMU-id and only second on lag, we get whole subtree clusters specific
> for a PMU.

Hmm... I'm not sure how that helps in this case. Wouldn't we still need
to walk the sibling list to get the HW PMU-id in the case of a SW group
leader?

For the heterogeneous case we'd need a different sort order per-cpu
(well, per microarchitecture), which sounds like we're going to have to
fully sort the events every time they move between CPUs. :/

> Lots of details missing in that picture, but I think something along
> those lines might get us what we want.

Perhaps! Hopefully I'm just missing those details above. :)
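To make my question concrete: even with such a tree, for a SW-led group
I think we'd still end up doing something like the walk below to find
the group's HW PMU. This is only a rough sketch (the helper name is
made up, and it assumes the current leader/sibling_list layout):

static struct pmu *group_hw_pmu(struct perf_event *leader)
{
	struct perf_event *sibling;

	/* A HW leader speaks for the whole group. */
	if (!is_software_event(leader))
		return leader->pmu;

	/* SW leader: the group's HW PMU (if any) is on a sibling. */
	list_for_each_entry(sibling, &leader->sibling_list, group_entry) {
		if (!is_software_event(sibling))
			return sibling->pmu;
	}

	/* Pure-SW group; nothing to filter against. */
	return leader->pmu;
}

i.e. unless we cache the result somewhere, we pay that sibling walk
every time we make a filtering or sort-key decision for the group.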
I also had another thought about solving the SW-led group case: if the
leader had a reference to the group's HW PMU (of which there should
only be one), we can filter on that alone, and can also use that in
group_sched_in rather than ctx->pmu, avoiding the issue that ctx->pmu
is not the same as the group's HW PMU.

I'll have a play with that approach in the meantime.

Thanks,
Mark.
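P.S. For concreteness, the rough shape I have in mind for the above is
below. It is only a sketch: the group_pmu field does not exist today,
and none of this is even compile-tested.

/*
 * At group attach time (e.g. from perf_group_attach()), have a SW
 * leader remember the group's HW PMU. 'group_pmu' is a hypothetical
 * new field on struct perf_event.
 */
static void perf_group_note_hw_pmu(struct perf_event *leader,
				   struct perf_event *sibling)
{
	if (is_software_event(leader) && !is_software_event(sibling))
		leader->group_pmu = sibling->pmu;
}

/*
 * Filtering (and group_sched_in) can then look at the cached HW PMU
 * alone, with no sibling walk.
 */
static inline int pmu_filter_match(struct perf_event *event)
{
	struct pmu *pmu = event->group_pmu ?: event->pmu;

	return pmu->filter_match ? pmu->filter_match(event) : 1;
}

That would keep the filter O(1) per group, and group_sched_in() could
use the group's HW PMU for the start_txn()/commit_txn() pair rather
than ctx->pmu.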