From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 27 May 2021 12:17:31 +0100
Message-ID: <87zgwgs9x0.wl-maz@kernel.org>
From: Marc Zyngier
To: Valentin Schneider
Cc: linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
	Thomas Gleixner, Lorenzo Pieralisi, Vincenzo Frascino
Subject: Re: [RFC PATCH v2 00/10] irqchip/irq-gic: Optimize masking by leveraging EOImode=1
In-Reply-To: <20210525173255.620606-1-valentin.schneider@arm.com>
References:
 <20210525173255.620606-1-valentin.schneider@arm.com>

On Tue, 25 May 2021 18:32:45 +0100,
Valentin Schneider wrote:
>
> Hi folks!
>
> This is the spiritual successor to [1], which was over 6 years ago (!).
>
> Revisions
> =========
>
> RFCv1 -> RFCv2
> ++++++++++++++
>
> o Rebased against latest tip/irq/core
> o Applied cleanups suggested by Thomas
>
> o Collected some performance results
>
> Background
> ==========
>
> GIC mechanics
> +++++++++++++
>
> There are three IRQ operations:
> o Acknowledge. This gives us the IRQ number that interrupted us, and also
>   - raises the running priority of the CPU interface to that of the IRQ
>   - sets the active bit of the IRQ
> o Priority Drop. This "clears" the running priority.
> o Deactivate. This clears the active bit of the IRQ.
>
> o The CPU interface has a running priority value. No interrupt of lower or
>   equal priority will be signaled to the CPU attached to that interface. On
>   Linux, we only have two priority values: pNMIs at highest priority, and
>   everything else at the other priority.
> o Most GIC interrupts have an "active" bit. This bit is set on Acknowledge
>   and cleared on Deactivate. A given interrupt cannot be re-signaled to a
>   CPU if it has its active bit set (i.e.
>   if it "fires" again while it's being handled).
>
> EOImode fun
> +++++++++++
>
> In EOImode=0, Priority Drop and Deactivate are undissociable. The
> (simplified) interrupt handling flow is as follows:
>
>   <~IRQ>
>     Acknowledge
>     Priority Drop + Deactivate
>
> With EOImode=1, we can invoke each operation individually. This gives us:
>
>   <~IRQ>
>     Acknowledge
>     Priority Drop
>     <*other* interrupts can be signaled from here, once interrupts are re-enabled>
>     Deactivate
>     <*this* interrupt can be signaled again>
>
> What this means is that with EOImode=1, any interrupt is kept "masked" by
> its active bit between Priority Drop and Deactivate.
>
> Threaded IRQs and ONESHOT
> =========================
>
> ONESHOT threaded IRQs must remain masked between the main handler and the
> threaded handler. Right now we do this using the conventional irq_mask()
> operations, which looks like this:
>
>     Acknowledge
>     Priority Drop
>     irq_mask()
>     Deactivate
>
>     irq_unmask()
>
> However, masking for the GICs means poking the distributor, and there's no
> sysreg for that - it's an MMIO access. We've seen above that our IRQ
> handling can give us masking "for free", and this is what this patch set is
> all about. It turns the above handling into:
>
>     Acknowledge
>     Priority Drop
>
>     Deactivate
>
> No irq_mask() => fewer MMIO accesses => happier users (or so I've been
> told). This is especially relevant to PREEMPT_RT which forces threaded
> IRQs.
>
> Functional testing
> ==================
>
> GICv2
> +++++
>
> I've tested this on my Juno with forced irqthreads. This makes the pl011
> IRQ into a threaded ONESHOT IRQ, so I spammed my keyboard into the console
> and verified via ftrace that there were no irq_mask() / irq_unmask()
> involved.
>
> GICv3
> +++++
>
> I've tested this on my Ampere eMAG, which uncovered "fun" interactions with
> the MSI domains. Did the same trick as the Juno with the pl011.
>
> pNMIs cause said eMAG to freeze, but that's true even without my patches.
> I did try them out under QEMU+KVM and that looked fine, although that
> means I only got to test EOImode=0. I'll try to dig into this when I get
> some more cycles.

That's interesting/worrying. As far as I remember, this machine uses
GIC500, which is a well known quantity. If pNMIs are causing issues,
that'd probably be a CPU interface problem.

Can you elaborate on how you tried to test that part? Just using the
below benchmark?

>
> Performance impact
> ==================
>
> Benchmark
> +++++++++
>
> Finding a benchmark that leverages a force-threaded IRQ has proved to be
> somewhat of a pain, so I crafted my own. It's a bit daft, but so are most
> benchmarks (though this one might win a prize).

I love it (and wrote similar hacks in my time)! :D

Can you put that up somewhere so that I can run the same test on my
own zoo and find out how it fares?

>
> Long story short, I'm picking an unused IRQ and have it be
> force-threaded. The benchmark then is:
>
>     loop:
>       irq_set_irqchip_state(irq, IRQCHIP_STATE_PENDING, true);
>       wait_for_completion(&done);
>
>     complete(&done);
>
> A more complete picture would be:
>
>     raise IRQ
>     wait
>     run flow handler
>     wake IRQ thread
>     finish handling
>     wake bench thread
>
> Letting this run for a fixed amount of time lets me measure an entire IRQ
> handling cycle, which is what I'm after since there's one less mask() in
> the flow handler and one less unmask() in the threaded handler.
>
> You'll note there's some potential "noise" in there due to scheduling both
> the benchmark thread and the IRQ thread. However, the IRQ thread is pinned
> to the IRQ's affinity, and I also pinned the benchmark thread in my tests,
> which should keep this noise to a minimum.
>
> Results
> +++++++
>
> On a Juno r0, 20 iterations of 5 seconds of that benchmark yields
> (measuring irqs/sec):
>
> | mean | median | 90th percentile | 99th percentile |
> |------+--------+-----------------+-----------------|
> | +11% | +11%   | +12%            | +14%            |
>
> On an Ampere eMAG, 20 iterations of 5 seconds of that benchmark yields
> (measuring irqs/sec):
>
> | mean | median | 90th percentile | 99th percentile |
> |------+--------+-----------------+-----------------|
> | +20% | +20%   | +20%            | +20%            |
>
> This is still quite "artificial", but it reassures me in that skipping those
> (un)mask operations can yield some measurable improvement.

20% improvement is even higher than I suspected!

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
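
The Acknowledge / Priority Drop / Deactivate mechanics quoted above, and
the two ONESHOT flows built on them, can be sketched as a small userspace
toy model. This is only an illustration under stated assumptions and not
code from the patch set or from the GIC drivers: struct toy_gic, every
toy_* function, the single modelled interrupt line, the priority values
and the "mmio" counter are all hypothetical. The model assumes one CPU
interface, counts one distributor access per mask or unmask, and ignores
everything else a real GIC does.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: one interrupt line, one CPU interface. Not real GIC code. */
#define TOY_PRIO_IDLE 0xffu	/* running priority when nothing is in flight */

struct toy_gic {
	bool eoimode1;			/* true: Priority Drop and Deactivate are split */
	bool pending;			/* the line fired and awaits delivery */
	bool active;			/* set on Acknowledge, cleared on Deactivate */
	bool masked;			/* disabled at the modelled distributor */
	unsigned int irq_prio;		/* lower value means higher priority */
	unsigned int running_prio;	/* running priority of the CPU interface */
	unsigned int mmio;		/* modelled distributor accesses so far */
};

static void toy_init(struct toy_gic *g, bool eoimode1)
{
	g->eoimode1 = eoimode1;
	g->pending = g->active = g->masked = false;
	g->irq_prio = 0xa0;		/* arbitrary value for the sketch */
	g->running_prio = TOY_PRIO_IDLE;
	g->mmio = 0;
}

static void toy_raise(struct toy_gic *g)
{
	g->pending = true;
}

/* Can the line be signaled to the CPU right now? */
static bool toy_can_signal(const struct toy_gic *g)
{
	return g->pending && !g->active && !g->masked &&
	       g->irq_prio < g->running_prio;
}

/* Acknowledge: raise the running priority and set the active bit. */
static void toy_ack(struct toy_gic *g)
{
	assert(toy_can_signal(g));	/* only a signaled line can be acked */
	g->pending = false;
	g->active = true;
	g->running_prio = g->irq_prio;
}

/* EOI: always drops priority; with EOImode=0 it also deactivates. */
static void toy_eoi(struct toy_gic *g)
{
	g->running_prio = TOY_PRIO_IDLE;
	if (!g->eoimode1)
		g->active = false;
}

/* Deactivate: clear the active bit (a separate step with EOImode=1). */
static void toy_deactivate(struct toy_gic *g)
{
	g->active = false;
}

static void toy_mask(struct toy_gic *g)
{
	g->masked = true;
	g->mmio++;
}

static void toy_unmask(struct toy_gic *g)
{
	g->masked = false;
	g->mmio++;
}

/* ONESHOT flow with explicit masking; returns modelled MMIO accesses. */
static unsigned int toy_oneshot_masked(struct toy_gic *g)
{
	toy_raise(g);
	toy_ack(g);
	toy_eoi(g);
	toy_mask(g);
	toy_deactivate(g);
	/* the threaded handler would run here */
	toy_unmask(g);
	return g->mmio;
}

/* ONESHOT flow that leans on the active bit instead (needs EOImode=1). */
static unsigned int toy_oneshot_active(struct toy_gic *g)
{
	toy_raise(g);
	toy_ack(g);
	toy_eoi(g);
	/* the threaded handler would run here, line still active */
	toy_deactivate(g);
	return g->mmio;
}
```

Driving the model from a main() with toy_init(&g, true), then
toy_raise(), toy_ack() and toy_eoi(), and raising the line a second time
leaves toy_can_signal() false until toy_deactivate() is called, which is
the "masked by its active bit" window described in the quoted text. The
counter only captures the mask/unmask accesses the cover letter talks
about; it says nothing about real-world cost.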