Date: Sat, 2 Jul 2016 18:40:25 +0200
From: Peter Zijlstra
To: Mark Rutland
Cc: linux-kernel@vger.kernel.org, Alexander Shishkin,
	Arnaldo Carvalho de Melo, Ingo Molnar, Will Deacon
Subject: Re: [PATCH] perf: fix pmu::filter_match for SW-led groups
Message-ID: <20160702164025.GU30921@twins.programming.kicks-ass.net>
References: <1465917041-15339-1-git-send-email-mark.rutland@arm.com>
In-Reply-To: <1465917041-15339-1-git-send-email-mark.rutland@arm.com>
User-Agent: Mutt/1.5.23.1 (2014-03-12)
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jun 14, 2016 at 04:10:41PM +0100, Mark Rutland wrote:
> However, pmu::filter_match is only called for the leader of each event
> group. When the leader is a SW event, we do not filter the groups, and
> may fail at pmu::add time, and when this happens we'll give up on
> scheduling any event groups later in the list until they are rotated
> ahead of the failing group.

Ha! Indeed.

> I've tried to find a better way of handling this (without needing to
> walk the siblings list), but so far I'm at a loss. At least it's "only"
> O(n) in the size of the sibling list we were going to walk anyway.
>
> I suspect that at a more fundamental level, I need to stop sharing a
> perf_hw_context between HW PMUs (i.e. replace
> task_struct::perf_event_ctxp with something that can handle multiple
> HW PMUs). From previous attempts I'm not sure if that's going to be
> possible.
>
> Any ideas appreciated!

So I have some half-cooked ideas. One of the problems I've been wanting
to solve for a long time is that the per-cpu flexible list has priority
over the per-task flexible list; I would like the two to rotate
together.

One of the ways I was looking at getting that done is a virtual runtime
scheduler (just like CFS). The tricky point is merging two virtual
runtime trees, but I think that should be doable if we sort the trees
on lag.

In any case, the relevance to your question is that once we have a
tree, we can play games with the ordering: if we order first on PMU-id
and only second on lag, we get whole subtree clusters specific to a
PMU, which could then be skipped as a unit when that PMU's filter does
not match.

Lots of details are still missing from that picture, but I think
something along those lines might get us what we want. Rough sketches
of both the sibling walk and the tree ordering follow below.
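
For reference, the sibling walk under discussion amounts to roughly the
following. This is only a sketch: the __pmu_filter_match() helper is an
assumed name for the existing per-event filter check, and the code is
not necessarily what the patch itself does.

/*
 * Sketch: apply the filter to the leader *and* every sibling, so a
 * SW-led group containing HW events is filtered out up front rather
 * than failing later at pmu::add time.
 */
static int pmu_filter_match(struct perf_event *event)
{
	struct perf_event *sibling;

	if (!__pmu_filter_match(event))
		return 0;

	list_for_each_entry(sibling, &event->sibling_list, group_entry) {
		if (!__pmu_filter_match(sibling))
			return 0;
	}

	return 1;
}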
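
And the (PMU-id, lag) ordering could be expressed as a tree key
comparison along these lines. Again just a sketch of the idea: the
group_key structure and group_key_less() are made-up names for
illustration, not existing perf internals.

/*
 * Sketch: key the flexible-event tree first on PMU-id, second on
 * CFS-style lag.  All groups of one PMU then form one contiguous
 * subtree cluster, while the lag ordering keeps rotation fair and
 * lets two trees (per-cpu and per-task) be merged.
 */
struct group_key {
	int	pmu_id;		/* primary: clusters groups per PMU    */
	s64	lag;		/* secondary: virtual-runtime fairness */
};

static inline bool group_key_less(const struct group_key *a,
				  const struct group_key *b)
{
	if (a->pmu_id != b->pmu_id)
		return a->pmu_id < b->pmu_id;

	return a->lag < b->lag;
}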