linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/4] /proc/stat: Reduce irqs counting performance overhead
@ 2019-01-09 19:20 Waiman Long
From: Waiman Long @ 2019-01-09 19:20 UTC
  To: Andrew Morton, Alexey Dobriyan, Kees Cook, Thomas Gleixner
  Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap, Matthew Wilcox,
	Waiman Long

 v1: https://lkml.org/lkml/2019/1/7/899
 v2: Fix a minor bug in patch 4 & update the cover-letter.

As newer systems have more and more IRQs and CPUs available, the
performance of reading /proc/stat frequently is getting worse and
worse.

It appears that the idea of caching the IRQ counts in the v1 patch to
reduce the frequency of doing percpu summation and use a sysctl parameter
to control it was not well received.

I have looked into the use of percpu counters for counting interrupts.
However, the following are the reasons why I don't think percpu counters
are the right choice for doing that.

 1) There is a raw spinlock in the percpu_counter structure that may
    need to be acquired in the update path. This can be a performance
    drag especially if lockdep is enabled.

 2) The percpu_counter structure is 40 bytes in size on 64-bit
    systems compared with just 8 bytes for the percpu count pointer and
    an additional 4 bytes that I introduced in patch 2 which may not
    actually increase the size of the IRQ descriptor. With thousands
    of irq descriptors, it can consume quite a lot more memory. Memory
    consumption was a point that had been brought up in the v1 patch
    review.

 3) Reading the patch 4 commit log, one can see that quite a few CPU
    cycles were spent looking up the radix tree to locate the IRQ
    descriptors for each of the interrupts. That overhead will still
    be there even if I use percpu counters. So using percpu counters
    alone won't be as performant as this patch or my previous v1 patch.
    Patch 4 optimizes the descriptor lookup process, which is independent
    of the percpu counter choice.

 4) Patches 2 and 3 are the patches that modify the percpu counting aspect
    of the IRQ counts. The number of changed lines of code is only 14. So
    they are very simple changes.
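
To make the memory argument in point 2 concrete, here is a rough
userspace sketch, not kernel code. All demo_* names are hypothetical,
and the field layout is only an assumption chosen to resemble a
percpu_counter-like structure (a lock word, a 64-bit count, a list
linkage and a per-CPU pointer). The real kernel structure depends on
the configuration, so the numbers may differ.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a doubly linked list node. */
struct demo_list_head {
	struct demo_list_head *next, *prev;
};

/*
 * Hypothetical stand-in for a percpu_counter-like structure.
 * This is an assumed layout for illustration only.
 */
struct demo_percpu_counter {
	unsigned int lock;		/* stand-in for a raw spinlock */
	int64_t count;			/* global count */
	struct demo_list_head list;	/* list linkage */
	int32_t *counters;		/* stand-in for the per-CPU pointer */
};

/*
 * The alternative described in point 2: a per-CPU count pointer plus
 * one additional 4-byte field. As a standalone struct this may be
 * padded; inside a larger descriptor the 4 bytes may fit into an
 * existing hole.
 */
struct demo_plain_count {
	unsigned int *percpu_count;
	unsigned int extra;
};

/* Extra bytes needed if every one of nr_descs descriptors used the counter. */
static size_t demo_extra_bytes(size_t nr_descs)
{
	return nr_descs * (sizeof(struct demo_percpu_counter) -
			   sizeof(struct demo_plain_count));
}
```

Calling demo_extra_bytes() with the number of IRQ descriptors on a
system gives a ballpark figure for the additional memory under these
assumptions.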

This new patch optimizes the way the IRQ counts are retrieved and gets
rid of the sysctl parameter altogether to achieve a performance gain
that is close to that of the v1 patch. This is based on the idea that
while many IRQs can be supported by a system, only a handful of them
are actually being used in most cases. We can save a lot of time by
focusing on those active IRQs only and ignoring the rest.
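
The idea of summing only the active IRQs can be illustrated with a
small userspace sketch. This is not the code in the patches. The
demo_* names, the fixed array sizes and the "active" flag (set here
the first time an interrupt is counted) are all assumptions made for
illustration.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_NR_CPUS 4	/* hypothetical CPU count */
#define DEMO_NR_IRQS 8	/* hypothetical number of IRQ descriptors */

/* Hypothetical stand-in for an IRQ descriptor, not struct irq_desc. */
struct demo_irq_desc {
	unsigned int percpu_count[DEMO_NR_CPUS]; /* per-CPU interrupt counts */
	int active;				 /* nonzero once counted at least once */
};

/* Count one interrupt on the given CPU and mark the IRQ as active. */
static void demo_irq_count(struct demo_irq_desc *desc, size_t cpu)
{
	assert(cpu < DEMO_NR_CPUS);
	desc->percpu_count[cpu]++;
	desc->active = 1;
}

/* Sum the per-CPU counts of one IRQ. */
static unsigned long demo_irq_sum(const struct demo_irq_desc *desc)
{
	unsigned long sum = 0;

	for (size_t cpu = 0; cpu < DEMO_NR_CPUS; cpu++)
		sum += desc->percpu_count[cpu];
	return sum;
}

/*
 * Fill totals[] with one total per IRQ. Inactive IRQs are reported as
 * zero without walking their per-CPU counts. Returns the number of
 * per-CPU summations that were actually performed.
 */
static size_t demo_sum_active_irqs(const struct demo_irq_desc *descs,
				   size_t nr_irqs, unsigned long *totals)
{
	size_t summed = 0;

	for (size_t irq = 0; irq < nr_irqs; irq++) {
		if (!descs[irq].active) {
			totals[irq] = 0;
			continue;
		}
		totals[irq] = demo_irq_sum(&descs[irq]);
		summed++;
	}
	return summed;
}
```

With only a handful of active IRQs, the return value of
demo_sum_active_irqs() stays small no matter how many descriptors
exist, which is where the time saving would come from in this sketch.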

Patch 1 is the same as that in v1 while the other 3 patches are new.

Waiman Long (4):
  /proc/stat: Extract irqs counting code into show_stat_irqs()
  /proc/stat: Only do percpu sum of active IRQs
  genirq: Track the number of active IRQs
  /proc/stat: Call kstat_irqs_usr() only for active IRQs

 fs/proc/stat.c          | 123 ++++++++++++++++++++++++++++++++++++++++++++----
 include/linux/irqdesc.h |   1 +
 kernel/irq/internals.h  |   6 ++-
 kernel/irq/irqdesc.c    |   7 ++-
 4 files changed, 125 insertions(+), 12 deletions(-)

-- 
1.8.3.1



Thread overview: 12+ messages
2019-01-09 19:20 [PATCH v3 0/4] /proc/stat: Reduce irqs counting performance overhead Waiman Long
2019-01-09 19:20 ` [PATCH v3 1/4] /proc/stat: Extract irqs counting code into show_stat_irqs() Waiman Long
2019-01-09 19:20 ` [PATCH v3 2/4] /proc/stat: Only do percpu sum of active IRQs Waiman Long
2019-01-09 19:20 ` [PATCH v3 3/4] genirq: Track the number " Waiman Long
2019-01-09 19:20 ` [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for " Waiman Long
2019-01-11 17:23   ` Thomas Gleixner
2019-01-11 19:19     ` Thomas Gleixner
2019-01-11 19:23       ` Matthew Wilcox
2019-01-11 21:02         ` Thomas Gleixner
2019-01-14 19:04           ` Waiman Long
2019-01-15  9:24             ` Thomas Gleixner
2019-01-15 15:52               ` Waiman Long
