Date: Wed, 8 Aug 2018 09:51:54 +0200
From: Peter Zijlstra
To: Reinette Chatre
Cc: Dave Hansen, tglx@linutronix.de, mingo@redhat.com, fenghua.yu@intel.com,
	tony.luck@intel.com, vikas.shivappa@linux.intel.com,
	gavin.hindman@intel.com, jithu.joseph@intel.com, hpa@zytor.com,
	x86@kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 0/2] x86/intel_rdt and perf/x86: Fix lack of coordination with perf
Message-ID: <20180808075154.GN2494@hirez.programming.kicks-ass.net>
References: <086b93f5-da5b-b5e5-148a-cef25117b963@intel.com>
	<20180803104956.GU2494@hirez.programming.kicks-ass.net>
	<1eece033-fbae-c904-13ad-1904be91c049@intel.com>
	<20180803152523.GY2476@hirez.programming.kicks-ass.net>
	<57c011e1-113d-c38f-c318-defbad085843@intel.com>
	<20180806221225.GO2458@hirez.programming.kicks-ass.net>
	<08d51131-7802-5bfe-2cae-d116807183d1@intel.com>
	<20180807093615.GY2494@hirez.programming.kicks-ass.net>

On Tue, Aug 07, 2018 at 03:47:15PM -0700, Reinette Chatre wrote:
> > FWIW, how long is that IRQ disabled section?
> > It looks like something that could be taking a bit of time. We have
> > these people that care about IRQ latency.
>
> We work closely with customers needing low latency as well as customers
> needing deterministic behavior.
>
> This measurement is triggered by the user as a validation mechanism of
> the pseudo-locked memory region after it has been created as part of
> system setup, as well as during runtime if there are any concerns with
> the performance of an application that uses it.
>
> This measurement would thus be triggered before the sensitive workloads
> start - during system setup, or if an issue is already present. In
> either case the measurement is triggered by the administrator via
> debugfs.

That does not in fact answer the question. Also, it assumes a competent
operator (something I've found is not always true).

> > - I don't much fancy people accessing the guts of events like that;
> >   would not an inline function like:
> >
> >   static inline u64 x86_perf_rdpmc(struct perf_event *event)
> >   {
> >           u64 val;
> >
> >           lockdep_assert_irqs_disabled();
> >
> >           rdpmcl(event->hw.event_base_rdpmc, val);
> >           return val;
> >   }
> >
> >   work for you?
>
> No. This does not provide accurate results. Implementing the above
> produces:
>
>   pseudo_lock_mea-366   [002] ....   34.950740: pseudo_lock_l2: hits=4096 miss=4

But it being an inline function should allow the compiler to optimize
and lift the event->hw.event_base_rdpmc load, like you now do manually.
Also, like Tony already suggested, you can prime that load just fine by
doing an extra invocation.

(And note that the above function is _much_ simpler than
perf_event_read_local().)

> > - native_read_pmc(); are you 100% sure this code only ever runs on
> >   native and not in some dodgy virt environment?
>
> My understanding is that a virtual environment would be a customer of a
> RDT allocation (cache or memory bandwidth). I do not see if/where this
> is restricted though - I'll move to rdpmcl(), but the usage of a cache
> allocation feature like this from a virtual machine needs more
> investigation.

I can imagine that hypervisors that allow physical partitioning could
allow delegating the rdt crud to their guests when they 'own' a full
socket, or whatever the domain is for this.

> Will do. I created the following helper function that can be used after
> interrupts are disabled:
>
> static inline int perf_event_error_state(struct perf_event *event)
> {
>         int ret = 0;
>         u64 tmp;
>
>         ret = perf_event_read_local(event, &tmp, NULL, NULL);
>         if (ret < 0)
>                 return ret;
>
>         if (event->attr.pinned && event->oncpu != smp_processor_id())
>                 return -EBUSY;
>
>         return ret;
> }

Nah, stick the test in perf_event_read_local(); that is what actually
needs it.

> > Also, while you disable IRQs, your fancy pants loop is still subject
> > to NMIs that can/will perturb your measurements; how do you deal with
> > those?
>
> Customers interested in this feature are familiar with dealing with
> them (and also SMIs). The user space counterpart is able to detect such
> an occurrence.

You're very optimistic about your customers' capabilities. And this
might be true for the current people you're talking to, but once this is
available and public, joe monkey will have access and he _will_ screw it
up.

> Please note that if an NMI arrives it would be handled with the
> currently active cache capacity bitmask, so none of the pseudo-locked
> memory will be evicted, since no capacity bitmask overlaps with the
> pseudo-locked region.

So exceptions change / have their own bitmask?