From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Wed, 5 Nov 2014 13:33:54 +0100
From: Peter Zijlstra
To: Robert Bragg
Cc: linux-kernel@vger.kernel.org, Paul Mackerras, Ingo Molnar,
	Arnaldo Carvalho de Melo, Daniel Vetter, Chris Wilson, Rob Clark,
	Samuel Pitoiset, Ben Skeggs
Subject: Re: [RFC PATCH 0/3] Expose gpu counters via perf pmu driver
Message-ID: <20141105123354.GR3337@twins.programming.kicks-ass.net>
References: <1413991731-20628-1-git-send-email-robert@sixbynine.org>
	<20141030190841.GI23531@worktop.programming.kicks-ass.net>
User-Agent: Mutt/1.5.21 (2012-12-30)

On Mon, Nov 03, 2014 at 09:47:17PM +0000, Robert Bragg wrote:

> > And do I take it right that if you're able/allowed/etc.. to open/have
> > the fd to the GPU/DRM/DRI whatever context you have the right
> > credentials to also observe these counters?
>
> Right, and in particular, since we want to allow OpenGL clients to be
> able to profile their own gpu context without any special privileges,
> my current pmu driver accepts a device file descriptor via config1 +
> a context id via attr->config, both for checking credentials and for
> uniquely identifying which context should be profiled. (A single
> client can open multiple contexts via one drm fd.)

Ah, interesting. So we've got fd+context_id+event_id to identify any
one number provided by the GPU.

> That said though, when running as root it is not currently a
> requirement to pass any fd when configuring an event to profile
> across all gpu contexts. I'm just mentioning this because although I
> think it should be ok for us to use an fd to determine credentials
> and help specify a gpu context, an fd might not be necessary for
> system-wide profiling cases.

Hmm, how does root know what context_id to provide? Are those exposed
somewhere? Is there also a root context, one that encompasses all
others?

> >> Conceptually I suppose we want to be able to open an event that's
> >> not associated with any cpu or process, but to keep things simple
> >> and fit with perf's current design, the pmu I have a.t.m expects an
> >> event to be opened for a specific cpu and an unspecified process.
> >
> > There are no actual scheduling ramifications, right? Let me ponder
> > this for a little while more..
>
> Ok, I can't say I'm familiar enough with the core perf infrastructure
> to be entirely sure about this.

Yeah, so I don't think so. It's on the device; nothing the CPU or
scheduler does affects what the device does.

> I recall looking at how some of the uncore perf drivers were working,
> and it looked like they had a similar issue where conceptually the
> pmu doesn't belong to a specific cpu, so the id would internally get
> mapped to some package state shared by multiple cpus.

Yeah, we could try and map these devices to a cpu on their node -- PCI
devices are node local. But I'm not sure we need to start out by doing
that.
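For concreteness, the open from userspace would then look something
like the sketch below; the function name and the pmu_type argument
(the dynamic PMU type number you'd read from sysfs) are made up, only
the attr.config/attr.config1 usage and the cpu-bound, taskless open
come from Robert's description above:

#include <linux/perf_event.h>
#include <sys/syscall.h>
#include <string.h>
#include <stdint.h>
#include <unistd.h>

static int gpu_ctx_event_open(int drm_fd, uint64_t ctx_id, int pmu_type)
{
	struct perf_event_attr attr;

	memset(&attr, 0, sizeof(attr));
	attr.size = sizeof(attr);
	attr.type = pmu_type;	/* dynamic PMU type, read from sysfs */
	attr.config = ctx_id;	/* gpu context id to profile */
	attr.config1 = drm_fd;	/* drm fd used for the credential check */

	/* pid = -1, cpu = 0: an event bound to a cpu with no task
	 * context, matching the driver's current expectation. */
	return syscall(__NR_perf_event_open, &attr, -1, 0, -1, 0);
}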
> My understanding had been that being associated with a specific cpu
> did have the side effect that most of the pmu methods for that event
> would then be invoked on that cpu through inter-processor interrupts.
> At one point that had seemed slightly problematic because there
> weren't many places within my pmu driver where I could assume I was
> in process context and could sleep. This was a problem with an
> earlier version, because the way I read registers had a slim chance
> of needing to sleep while waiting for the gpu to come out of RC6, but
> it isn't a problem any more.

Right, so I suppose we could make a new global context for these
device-like things and avoid some of that song and dance. But we can
do that later.

> One thing that does come to mind here, though, is that I am
> overloading pmu->read() as a mechanism for userspace to trigger a
> flush of all the counter snapshots currently in the gpu circular
> buffer to userspace as perf events. Perhaps it would be best if that
> work (which might be relatively costly at times) were done in the
> context of the process issuing the flush(), instead of under an IPI
> (assuming that has some effect on scheduler accounting).

Right, so given that you tell the GPU to periodically dump these stats
(per context, I presume), you can at a similar interval schedule
whatever is needed to flush them and update the relevant event->count
values, and have a no-op pmu::read() method. If the GPU provides
interrupts to notify you of new data or whatnot, you can make those
drive the thing.
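Roughly what I have in mind, as a loose sketch; gpu_drain_snapshots(),
gpu_flush_timer_fn() and FLUSH_PERIOD_NS are all made-up names, and a
real driver would hang the timer off its device state:

static void gpu_pmu_read(struct perf_event *event)
{
	/* Intentionally empty; event->count is updated from the flush
	 * path below instead of synchronously under an IPI. */
}

static enum hrtimer_restart gpu_flush_timer_fn(struct hrtimer *timer)
{
	/*
	 * gpu_drain_snapshots() is hypothetical: it would walk the
	 * gpu's circular buffer of counter snapshots, emit them to the
	 * perf ring buffer and publish the latest value of each event,
	 * e.g. with local64_set(&event->count, snapshot_value).
	 */
	gpu_drain_snapshots();

	hrtimer_forward_now(timer, ns_to_ktime(FLUSH_PERIOD_NS));
	return HRTIMER_RESTART;
}

static struct pmu gpu_pmu = {
	.read = gpu_pmu_read,
	/* .event_init, .add, .del, .start, .stop etc. omitted */
};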