From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S933170AbcL0XK7 (ORCPT ); Tue, 27 Dec 2016 18:10:59 -0500
Received: from one.firstfloor.org ([193.170.194.197]:51609 "EHLO
	one.firstfloor.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1754312AbcL0XKw (ORCPT );
	Tue, 27 Dec 2016 18:10:52 -0500
Date: Tue, 27 Dec 2016 15:10:49 -0800
From: Andi Kleen
To: David Carrillo-Cisneros
Cc: Andi Kleen, Shivappa Vikas, Peter Zijlstra, Vikas Shivappa,
	linux-kernel, x86, Thomas Gleixner, "Shankar, Ravi V",
	"Luck, Tony", Fenghua Yu, Stephane Eranian, hpa@zytor.com
Subject: Re: [PATCH 01/14] x86/cqm: Intel Resource Monitoring Documentation
Message-ID: <20161227231049.GT26852@two.firstfloor.org>
References: <1481929988-31569-1-git-send-email-vikas.shivappa@linux.intel.com>
	<1481929988-31569-2-git-send-email-vikas.shivappa@linux.intel.com>
	<20161223123228.GQ3107@twins.programming.kicks-ass.net>
	<20161223203318.GU3107@twins.programming.kicks-ass.net>
	<87vau5gn1w.fsf@firstfloor.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Dec 27, 2016 at 01:33:46PM -0800, David Carrillo-Cisneros wrote:
> When using one intel_cmt/llc_occupancy/ cgroup perf_event on one CPU,
> the avg time to do __perf_event_task_sched_out +
> __perf_event_task_sched_in is ~1170 ns.
>
> Most of the time is spent in the cgroup ctx switch (~1120 ns).
>
> When using continuous monitoring in the CQM driver, the avg time to
> find the RMID to write inside the PQR context switch is ~16 ns.
>
> Note that this excludes the MSR write. It's only the overhead of
> finding the RMID to write in PQR_ASSOC. Both paths call the same
> routine to find the RMID, so there are about 1100 ns of overhead in
> perf_cgroup_switch. By inspection I assume most of it comes from
> iterating over the pmu list.

Do Kan's pmu list patches help?

https://patchwork.kernel.org/patch/9420035/

> > Or is there some other overhead other than the MSR write
> > you're concerned about?
>
> No, that problem is solved with the PQR software cache introduced in
> the series.

So it's already fixed? How much is the cost with your cache?

> > Perhaps some optimization could be done in the code to make it
> > faster, then the new interface wouldn't be needed.
>
> There are some. One on my list is to create a list of pmus with at
> least one cgroup event and use it to iterate over in
> perf_cgroup_switch, instead of using the "pmus" list. The pmus list
> has grown a lot recently with the addition of all the uncore pmus.

Kan's patches above already do that, I believe.

> Despite this optimization, it's unlikely that the whole sched_out +
> sched_in gets that close to the 15 ns of the non-perf_event approach.

It would be good to see how close we can get. I assume there is more
potential for optimizations and fast pathing.

-Andi
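
PS: For anyone following along, the "PQR software cache" discussed above
boils down to keeping a per-CPU copy of the last RMID/CLOSID pair written
to IA32_PQR_ASSOC and skipping the MSR write when nothing changed. Below
is a minimal user-space sketch of that idea (structure, field, and
function names here are illustrative stand-ins, not the actual code from
the series; the wrmsr is stubbed out so the example compiles and runs
anywhere):

/*
 * Sketch of a PQR software cache: one pqr_state per CPU caches the
 * last value written to IA32_PQR_ASSOC, so the context-switch fast
 * path can skip the expensive MSR write when the RMID/CLOSID pair is
 * unchanged.  All names below are hypothetical.
 */
#include <stdint.h>
#include <stdio.h>

#define MSR_IA32_PQR_ASSOC 0xc8f

struct pqr_state {
	uint32_t rmid;		/* RMID currently programmed in the MSR */
	uint32_t closid;	/* CLOSID currently programmed in the MSR */
};

/* The real driver would keep one of these per CPU; one is enough here. */
static struct pqr_state pqr_cache;

/* Stand-in for wrmsr(); only counts how often a write would happen. */
static unsigned long msr_writes;

static void wrmsr_stub(uint32_t msr, uint32_t lo, uint32_t hi)
{
	(void)msr; (void)lo; (void)hi;
	msr_writes++;
}

static void pqr_update(uint32_t rmid, uint32_t closid)
{
	/* Fast path: hardware already holds the right IDs, no MSR cost. */
	if (pqr_cache.rmid == rmid && pqr_cache.closid == closid)
		return;

	pqr_cache.rmid = rmid;
	pqr_cache.closid = closid;
	/* IA32_PQR_ASSOC: RMID in the low half, CLOSID in the high half. */
	wrmsr_stub(MSR_IA32_PQR_ASSOC, rmid, closid);
}

int main(void)
{
	/* Back-to-back switches between tasks sharing an RMID: one write. */
	pqr_update(5, 1);
	pqr_update(5, 1);
	pqr_update(7, 1);
	printf("requested 3 updates, issued %lu MSR writes\n", msr_writes);
	return 0;
}

The point of the compare-before-write is that switches between tasks in
the same monitoring group never touch the MSR at all, which is exactly
the write cost David's ~16 ns figure excludes.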