linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 00/13] irqchip/irq-gic: Optimize masking by leveraging EOImode=1
@ 2021-06-29 12:49 Valentin Schneider
From: Valentin Schneider @ 2021-06-29 12:49 UTC
  To: linux-kernel, linux-arm-kernel
  Cc: Marc Zyngier, Thomas Gleixner, Lorenzo Pieralisi, Vincenzo Frascino

Hi folks!

This is the spiritual successor to the series below, which was posted over
6 years ago (!):
 https://lore.kernel.org/lkml/1414235215-10468-1-git-send-email-marc.zyngier@arm.com/

The series is available, along with my silly IRQ benchmark, at:
  https://git.gitlab.arm.com/linux-arm/linux-vs.git -b mainline/irq/eoimodness-v3

Revisions
=========

RFCv2 -> v3
+++++++++++

o Rebased on top of tip/irq/core:
  3d2ce675aba7 ("Merge tag 'irqchip-5.14' of git://git.kernel.org/pub/scm/linux/kernel/git/maz/arm-platforms into irq/core")
o Tested with irqchip.gicv3_pseudo_nmi=1 using Marc's fixes
  (arm64/for-next/cpuidle) on Ampere eMAG and Ampere Altra.
o Re-collected performance numbers for Juno and Ampere eMAG, and collected
  them for Ampere Altra as well
  
o Fixed s/irq_{ack, eoi}/{ack, eoi}_irq/ naming blunder (Marc)
o Gave msi_domain_update_chip_ops() default .irq_ack() and
  .irq_eoi() (Marc)

  Marc had suggested implementing a default callback that scans the domain
  hierarchy for .irq_ack / .irq_eoi() and calls the first non-NULL
  one. Now, things like nexus domains already have an irq_chip_eoi_parent();
  leaving this would defeat using a "smarter" version in child domains, and
  removing it felt like further obscuring the hierarchies. So just like
  turtles, I went with irq_chip_{ack, eoi}_parent() all the way down.

o Added .irq_ack() callbacks to relevant GIC gadgets (Marc)

  There might still be something to be done wrt chip flags, but I'll leave that
  as it is for now. See my ramblings at:
  http://lore.kernel.org/r/87lf7bb1ek.mognet@arm.com
  
RFCv1 -> RFCv2
++++++++++++++

o Rebased against latest tip/irq/core
o Applied cleanups suggested by Thomas

o Collected some performance results

Background
==========

GIC mechanics
+++++++++++++

There are three IRQ operations:
o Acknowledge. This gives us the IRQ number that interrupted us, and also
  - raises the running priority of the CPU interface to that of the IRQ
  - sets the active bit of the IRQ
o Priority Drop. This "clears" the running priority.
o Deactivate. This clears the active bit of the IRQ.

They act on two pieces of state:
o The CPU interface has a running priority value. No interrupt of lower or
  equal priority will be signaled to the CPU attached to that interface. On
  Linux, we only have two priority values: pNMIs at highest priority, and
  everything else at the other priority.
o Most GIC interrupts have an "active" bit. This bit is set on Acknowledge
  and cleared on Deactivate. A given interrupt cannot be re-signaled to a
  CPU if it has its active bit set (i.e. if it "fires" again while it's
  being handled).
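
As a rough illustration, the rules above can be modelled in plain userspace
C. This is a toy sketch, not GIC driver code: the struct layouts, the toy_*
names and the PRIO_IDLE value are all made up for the example, and Priority
Drop is simplified to "go back to idle" rather than restoring a previous
(nested) priority. Lower numerical value means higher priority.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical "nothing is being handled" running priority. */
#define PRIO_IDLE 0xff

struct toy_irq {
	unsigned char prio;	/* lower value == higher priority */
	bool pending;		/* the IRQ has fired and awaits handling */
	bool active;		/* set on Acknowledge, cleared on Deactivate */
};

struct toy_cpu_if {
	unsigned char running_prio;
};

/* An IRQ is signaled only if it beats the running priority and is not active. */
static bool toy_can_signal(const struct toy_cpu_if *cpu,
			   const struct toy_irq *irq)
{
	return irq->pending && !irq->active && irq->prio < cpu->running_prio;
}

/* Acknowledge: raise the running priority and set the active bit. */
static void toy_acknowledge(struct toy_cpu_if *cpu, struct toy_irq *irq)
{
	assert(toy_can_signal(cpu, irq));
	irq->pending = false;
	irq->active = true;
	cpu->running_prio = irq->prio;
}

/* Priority Drop: "clear" the running priority (simplified, no nesting). */
static void toy_priority_drop(struct toy_cpu_if *cpu)
{
	cpu->running_prio = PRIO_IDLE;
}

/* Deactivate: clear the active bit. */
static void toy_deactivate(struct toy_irq *irq)
{
	irq->active = false;
}
```

With two IRQs of equal priority, acknowledging the first makes
toy_can_signal() false for the second until toy_priority_drop(), and false
for the first itself until toy_deactivate().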

EOImode fun
+++++++++++

In EOImode=0, Priority Drop and Deactivate are inseparable. The
(simplified) interrupt handling flow is as follows:

  <~IRQ>
    Acknowledge
    Priority Drop + Deactivate
    <interrupts can once again be signaled, once interrupts are re-enabled>

With EOImode=1, we can invoke each operation individually. This gives us:

  <~IRQ>
    Acknowledge
    Priority Drop
    <*other* interrupts can be signaled from here, once interrupts are re-enabled>
    Deactivate
    <*this* interrupt can be signaled again>

What this means is that with EOImode=1, any interrupt is kept "masked" by
its active bit between Priority Drop and Deactivate.
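
The difference between the two modes can be sketched as a small state
model. Again this is only an illustration under simplifying assumptions: one
CPU interface, one IRQ, and made-up names (struct eoi_model and the model_*
helpers); it does not reflect actual register accesses.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy state for one CPU interface handling a single IRQ. */
struct eoi_model {
	int eoimode;		/* 0 or 1 */
	bool prio_raised;	/* running priority raised by Acknowledge */
	bool active;		/* active bit of the IRQ */
};

static void model_ack(struct eoi_model *m)
{
	m->prio_raised = true;
	m->active = true;
}

/* End-of-interrupt: always Priority Drop, plus Deactivate in EOImode=0. */
static void model_eoi(struct eoi_model *m)
{
	m->prio_raised = false;
	if (m->eoimode == 0)
		m->active = false;
}

/* Standalone Deactivate, only available as a separate step in EOImode=1. */
static void model_deactivate(struct eoi_model *m)
{
	assert(m->eoimode == 1);
	m->active = false;
}

/* *Other* interrupts only care about the running priority. */
static bool others_can_fire(const struct eoi_model *m)
{
	return !m->prio_raised;
}

/* *This* interrupt is additionally held back by its active bit. */
static bool this_can_fire(const struct eoi_model *m)
{
	return !m->prio_raised && !m->active;
}
```

In EOImode=1, the window where others_can_fire() is true but
this_can_fire() is false is exactly the "free" masking described above.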

Threaded IRQs and ONESHOT
=========================

ONESHOT threaded IRQs must remain masked between the main handler and the
threaded handler. Right now we do this using the conventional irq_mask()
operations, which look like this:

 <irq handler>
   Acknowledge
   Priority Drop   
   irq_mask()
   Deactivate

 <threaded handler>
   irq_unmask()

However, masking for the GICs means poking the distributor, and there's no
sysreg for that - it's an MMIO access. We've seen above that our IRQ
handling can give us masking "for free", and this is what this patch set is
all about. It turns the above handling into:

  <irq handler>
    Acknowledge
    Priority Drop

  <threaded handler>
    Deactivate

No irq_mask() => fewer MMIO accesses => happier users (or so I've been
told). This is especially relevant to PREEMPT_RT which forces threaded
IRQs.
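
To make the saving concrete, here is a toy comparison of the two ONESHOT
flows that simply counts distributor accesses. It is a sketch under stated
assumptions: struct oneshot_model, dist_set_mask() and the flow function
names are invented for the example, and the Priority Drop step is left out
since it does not affect the count.

```c
#include <assert.h>
#include <stdbool.h>

struct oneshot_model {
	bool masked;			/* distributor-level mask */
	bool active;			/* active bit of the IRQ */
	unsigned int dist_accesses;	/* distributor (MMIO) accesses so far */
};

/* Changing the mask means poking the distributor. */
static void dist_set_mask(struct oneshot_model *m, bool masked)
{
	m->masked = masked;
	m->dist_accesses++;
}

/* Current flow: the hardirq part masks, the threaded part unmasks. */
static void hardirq_with_mask(struct oneshot_model *m)
{
	m->active = true;		/* Acknowledge */
	dist_set_mask(m, true);		/* irq_mask() */
	m->active = false;		/* Deactivate */
}

static void thread_with_unmask(struct oneshot_model *m)
{
	dist_set_mask(m, false);	/* irq_unmask() */
}

/* Flow from this series: the active bit does the masking. */
static void hardirq_flow_masked(struct oneshot_model *m)
{
	m->active = true;		/* Acknowledge; Deactivate is deferred */
}

static void thread_flow_masked(struct oneshot_model *m)
{
	m->active = false;		/* Deactivate */
}

static bool can_refire(const struct oneshot_model *m)
{
	return !m->masked && !m->active;
}
```

Both variants keep can_refire() false between the hardirq part and the
threaded part; the first needs two distributor accesses per interrupt in
this model, the second none.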
    
Functional testing
==================

GICv2
+++++

I've tested this on my Juno with forced irqthreads. This makes the pl011
IRQ into a threaded ONESHOT IRQ, so I spammed my keyboard into the console
and verified via ftrace that there were no irq_mask() / irq_unmask() calls
involved.

GICv3
+++++

I've tested this on my Ampere eMAG, which uncovered "fun" interactions with
the MSI domains. I used the same pl011 trick as on the Juno.

With Marc's pNMI vs cpuidle fixes (arm64/for-next/cpuidle), I also got to test
this against pNMIs on Ampere eMAG & Altra. Nothing to report here.

Performance impact
==================

Benchmark
+++++++++

Finding a benchmark that leverages a force-threaded IRQ has proved to be
somewhat of a pain, so I crafted my own. It's a bit daft, but so are most
benchmarks (though this one might win a prize).

Long story short, I pick an unused IRQ and have it be force-threaded. The
benchmark then is:

  <bench thread>
    loop:
      irq_set_irqchip_state(irq, IRQCHIP_STATE_PENDING, true);
      wait_for_completion(&done);

  <threaded handler>
    complete(&done);

A more complete picture would be:

  <bench thread>   <whatever is on CPU0>   <IRQ thread>
    raise IRQ
    wait
                     run flow handler
                       wake IRQ thread
                                             finish handling
                                             wake bench thread
    
Letting this run for a fixed amount of time lets me measure an entire IRQ
handling cycle, which is what I'm after since there's one less mask() in
the flow handler and one less unmask() in the threaded handler.

You'll note there's some potential "noise" in there due to scheduling both
the benchmark thread and the IRQ thread. However, the IRQ thread is pinned
to the IRQ's affinity, and I also pinned the benchmark thread in my tests,
which should keep this noise to a minimum.

Results
+++++++

20 iterations of 5 seconds of the above benchmark, measuring irqs/sec delta
between tip/irq/core and the series:

Juno r0:
| mean | median | 90th percentile | 99th percentile |
|------+--------+-----------------+-----------------|
|  +6% |    +6% |             +6% |             +6% |

Ampere eMAG:
| mean | median | 90th percentile | 99th percentile |
|------+--------+-----------------+-----------------|
| +21% |   +22% |            +20% |            +20% |

Ampere Altra:
| mean | median | 90th percentile | 99th percentile |
|------+--------+-----------------+-----------------|
| +22% |   +22% |            +22% |            +22% |
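
For reference, summary statistics like the ones above could be derived from
per-iteration irqs/sec samples along these lines. This is a sketch only: the
helper names are invented, the nearest-rank percentile definition is an
assumption (the median is then just the 50th percentile), and it is not
necessarily how the numbers in the tables were produced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

static int cmp_double(const void *a, const void *b)
{
	double x = *(const double *)a;
	double y = *(const double *)b;

	return (x > y) - (x < y);
}

/* Arithmetic mean of n samples. */
static double stat_mean(const double *v, size_t n)
{
	double sum = 0.0;
	size_t i;

	assert(n > 0);
	for (i = 0; i < n; i++)
		sum += v[i];
	return sum / (double)n;
}

/* Nearest-rank percentile, p in [1, 100]. Sorts v in place. */
static double stat_percentile(double *v, size_t n, unsigned int p)
{
	size_t rank;

	assert(n > 0 && p >= 1 && p <= 100);
	qsort(v, n, sizeof(*v), cmp_double);
	rank = (p * n + 99) / 100;	/* ceil(p * n / 100) */
	return v[rank - 1];
}

/* Relative change of the series against the baseline, in percent. */
static double delta_pct(double base, double series)
{
	assert(base > 0.0);
	return (series - base) * 100.0 / base;
}
```

Each statistic would be computed separately for the baseline and for the
series, and delta_pct() applied to the resulting pair.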


Cheers,
Valentin

Valentin Schneider (13):
  genirq: Add chip flag to denote automatic IRQ (un)masking
  genirq: Define ack_irq() and eoi_irq() helpers
  genirq: Employ ack_irq() and eoi_irq() where relevant
  genirq: Add handle_strict_flow_irq() flow handler
  genirq: Let purely flow-masked ONESHOT irqs through
    unmask_threaded_irq()
  genirq: Don't mask IRQ within flow handler if IRQ is flow-masked
  genirq, irq-gic-v3: Make NMI flow handlers use ->irq_ack() if
    available
  genirq/msi: Provide default .irq_eoi() for MSI chips
  irqchip/gic: Rely on MSI default .irq_eoi()
  genirq/msi: Provide default .irq_ack() for MSI chips
  irqchip/gic: Add .irq_ack() to GIC-based irqchips
  irqchip/gic: Convert to handle_strict_flow_irq()
  irqchip/gic-v3: Convert to handle_strict_flow_irq()

 drivers/base/platform-msi.c                 |   2 -
 drivers/irqchip/irq-gic-v2m.c               |   2 +-
 drivers/irqchip/irq-gic-v3-its-fsl-mc-msi.c |   1 -
 drivers/irqchip/irq-gic-v3-its-pci-msi.c    |   1 -
 drivers/irqchip/irq-gic-v3-its.c            |   3 +
 drivers/irqchip/irq-gic-v3-mbi.c            |   2 +-
 drivers/irqchip/irq-gic-v3.c                |  27 +++--
 drivers/irqchip/irq-gic.c                   |  14 ++-
 include/linux/irq.h                         |  15 ++-
 kernel/irq/chip.c                           | 122 +++++++++++++++++---
 kernel/irq/debugfs.c                        |   2 +
 kernel/irq/internals.h                      |   7 ++
 kernel/irq/manage.c                         |   2 +-
 kernel/irq/msi.c                            |   4 +
 14 files changed, 166 insertions(+), 38 deletions(-)

--
2.25.1



Thread overview: 34+ messages
2021-06-29 12:49 [PATCH v3 00/13] irqchip/irq-gic: Optimize masking by leveraging EOImode=1 Valentin Schneider
2021-06-29 12:49 ` [PATCH v3 01/13] genirq: Add chip flag to denote automatic IRQ (un)masking Valentin Schneider
2021-08-12 15:13   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:49 ` [PATCH v3 02/13] genirq: Define ack_irq() and eoi_irq() helpers Valentin Schneider
2021-08-12  7:49   ` Marc Zyngier
2021-08-12 13:36     ` Valentin Schneider
2021-08-12 14:45       ` Marc Zyngier
2021-08-12 15:13   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 03/13] genirq: Employ ack_irq() and eoi_irq() where relevant Valentin Schneider
2021-08-12 15:13   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 04/13] genirq: Add handle_strict_flow_irq() flow handler Valentin Schneider
2021-08-12 15:13   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 05/13] genirq: Let purely flow-masked ONESHOT irqs through unmask_threaded_irq() Valentin Schneider
2021-08-12  7:26   ` Marc Zyngier
2021-08-12 13:36     ` Valentin Schneider
2021-08-12 14:45       ` Marc Zyngier
2021-08-12 21:38         ` Valentin Schneider
2021-08-12 15:13   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 06/13] genirq: Don't mask IRQ within flow handler if IRQ is flow-masked Valentin Schneider
2021-08-12 15:13   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 07/13] genirq, irq-gic-v3: Make NMI flow handlers use ->irq_ack() if available Valentin Schneider
2021-08-12 15:13   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 08/13] genirq/msi: Provide default .irq_eoi() for MSI chips Valentin Schneider
2021-08-12 15:13   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 09/13] irqchip/gic: Rely on MSI default .irq_eoi() Valentin Schneider
2021-08-12 15:12   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 10/13] genirq/msi: Provide default .irq_ack() for MSI chips Valentin Schneider
2021-08-12 15:12   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 11/13] irqchip/gic: Add .irq_ack() to GIC-based irqchips Valentin Schneider
2021-08-12 15:12   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 12/13] irqchip/gic: Convert to handle_strict_flow_irq() Valentin Schneider
2021-08-12 15:12   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
2021-06-29 12:50 ` [PATCH v3 13/13] irqchip/gic-v3: " Valentin Schneider
2021-08-12 15:12   ` [irqchip: irq/irqchip-next] " irqchip-bot for Valentin Schneider
