linux-kernel.vger.kernel.org archive mirror
* [PATCH v3 0/4] /proc/stat: Reduce irqs counting performance overhead
@ 2019-01-09 19:20 Waiman Long
  2019-01-09 19:20 ` [PATCH v3 1/4] /proc/stat: Extract irqs counting code into show_stat_irqs() Waiman Long
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Waiman Long @ 2019-01-09 19:20 UTC (permalink / raw)
  To: Andrew Morton, Alexey Dobriyan, Kees Cook, Thomas Gleixner
  Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap, Matthew Wilcox,
	Waiman Long

 v1: https://lkml.org/lkml/2019/1/7/899
 v2: Fix a minor bug in patch 4 & update the cover-letter.

As newer systems have more and more IRQs and CPUs available, the
performance of reading /proc/stat frequently is getting worse and worse.

It appears that the v1 idea of caching the IRQ counts to reduce the
frequency of the percpu summation, with a sysctl parameter to control
it, was not well received.

I have looked into the use of percpu counters for counting interrupts.
However, the following are the reasons why I don't think percpu
counters are the right choice for doing that.

 1) There is a raw spinlock in the percpu_counter structure that may
    need to be acquired in the update path. This can be a performance
    drag, especially if lockdep is enabled.

 2) The percpu_counter structure is 40 bytes in size on 64-bit systems
    (see the structure sketch after this list), compared with just
    8 bytes for the percpu count pointer plus the additional 4 bytes
    introduced in patch 2, which may not actually increase the size of
    the IRQ descriptor. With thousands of irq descriptors, percpu
    counters can consume quite a lot more memory. Memory consumption
    was a point that had been brought up in the v1 patch review.

 3) Reading the patch 4 commit log, one can see that quite a few CPU
    cycles were spent looking up the radix tree to locate the IRQ
    descriptors for each of the interrupts. That overhead will still
    be there even if I use percpu counters, so using percpu counters
    alone won't be as performant as this patch or my previous v1 patch.
    Patch 4 optimizes the descriptor lookup process, which is
    independent of the percpu counter choice.

 4) Patches 2 and 3 are the ones that modify the percpu counting aspect
    of the IRQ counts. They change only 14 lines of code, so they are
    very simple.
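
For reference, the sketch below shows the percpu_counter structure
layout behind the 40-byte figure in point 2 (see
include/linux/percpu_counter.h; the exact size depends on config
options such as CONFIG_HOTPLUG_CPU and lock debugging):

	struct percpu_counter {
		raw_spinlock_t	lock;		/*  4 bytes (+ 4 padding) */
		s64		count;		/*  8 bytes */
	#ifdef CONFIG_HOTPLUG_CPU
		struct list_head list;		/* 16 bytes, counter list */
	#endif
		s32 __percpu	*counters;	/*  8 bytes */
	};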

This new patch series optimizes the way the IRQ counts are retrieved
and gets rid of the sysctl parameter altogether, achieving a
performance gain close to that of the v1 patch. It is based on the
observation that while a system may support many IRQs, only a handful
of them are actually in use in most cases. We can save a lot of time
by focusing on those active IRQs only and ignoring the rest.
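
As a rough illustration of the idea (this is not the actual patch code;
irq_is_active() is a hypothetical helper standing in for the active-IRQ
tracking that patches 2-4 add):

	/* Print the per-IRQ counts, skipping the percpu summation for
	 * IRQs that have never fired and just emitting "0" for them. */
	static void show_active_irqs(struct seq_file *p)
	{
		int i;

		for_each_irq_nr(i) {
			if (irq_is_active(i))	/* hypothetical helper */
				seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
			else
				seq_put_decimal_ull(p, " ", 0);
		}
	}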

Patch 1 is the same as in v1, while the other three patches are new.

Waiman Long (4):
  /proc/stat: Extract irqs counting code into show_stat_irqs()
  /proc/stat: Only do percpu sum of active IRQs
  genirq: Track the number of active IRQs
  /proc/stat: Call kstat_irqs_usr() only for active IRQs

 fs/proc/stat.c          | 123 ++++++++++++++++++++++++++++++++++++++++++++----
 include/linux/irqdesc.h |   1 +
 kernel/irq/internals.h  |   6 ++-
 kernel/irq/irqdesc.c    |   7 ++-
 4 files changed, 125 insertions(+), 12 deletions(-)

-- 
1.8.3.1



* [PATCH v3 1/4] /proc/stat: Extract irqs counting code into show_stat_irqs()
  2019-01-09 19:20 [PATCH v3 0/4] /proc/stat: Reduce irqs counting performance overhead Waiman Long
@ 2019-01-09 19:20 ` Waiman Long
  2019-01-09 19:20 ` [PATCH v3 2/4] /proc/stat: Only do percpu sum of active IRQs Waiman Long
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 12+ messages in thread
From: Waiman Long @ 2019-01-09 19:20 UTC (permalink / raw)
  To: Andrew Morton, Alexey Dobriyan, Kees Cook, Thomas Gleixner
  Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap, Matthew Wilcox,
	Waiman Long

The code that generates the "intr" line of /proc/stat is moved from
show_stat() into a new function, show_stat_irqs(). There is no
functional change.

Signed-off-by: Waiman Long <longman@redhat.com>
Reviewed-by: Kees Cook <keescook@chromium.org>
---
 fs/proc/stat.c | 39 +++++++++++++++++++++++++++++----------
 1 file changed, 29 insertions(+), 10 deletions(-)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 535eda7..4b06f1b 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -79,12 +79,38 @@ static u64 get_iowait_time(int cpu)
 
 #endif
 
+static u64 compute_stat_irqs_sum(void)
+{
+	int i;
+	u64 sum = 0;
+
+	for_each_possible_cpu(i) {
+		sum += kstat_cpu_irqs_sum(i);
+		sum += arch_irq_stat_cpu(i);
+	}
+	sum += arch_irq_stat();
+	return sum;
+}
+
+/*
+ * Print out the "intr" line of /proc/stat.
+ */
+static void show_stat_irqs(struct seq_file *p)
+{
+	int i;
+
+	seq_put_decimal_ull(p, "intr ", compute_stat_irqs_sum());
+	for_each_irq_nr(i)
+		seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
+
+	seq_putc(p, '\n');
+}
+
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i, j;
 	u64 user, nice, system, idle, iowait, irq, softirq, steal;
 	u64 guest, guest_nice;
-	u64 sum = 0;
 	u64 sum_softirq = 0;
 	unsigned int per_softirq_sums[NR_SOFTIRQS] = {0};
 	struct timespec64 boottime;
@@ -105,8 +131,6 @@ static int show_stat(struct seq_file *p, void *v)
 		steal += kcpustat_cpu(i).cpustat[CPUTIME_STEAL];
 		guest += kcpustat_cpu(i).cpustat[CPUTIME_GUEST];
 		guest_nice += kcpustat_cpu(i).cpustat[CPUTIME_GUEST_NICE];
-		sum += kstat_cpu_irqs_sum(i);
-		sum += arch_irq_stat_cpu(i);
 
 		for (j = 0; j < NR_SOFTIRQS; j++) {
 			unsigned int softirq_stat = kstat_softirqs_cpu(j, i);
@@ -115,7 +139,6 @@ static int show_stat(struct seq_file *p, void *v)
 			sum_softirq += softirq_stat;
 		}
 	}
-	sum += arch_irq_stat();
 
 	seq_put_decimal_ull(p, "cpu  ", nsec_to_clock_t(user));
 	seq_put_decimal_ull(p, " ", nsec_to_clock_t(nice));
@@ -154,14 +177,10 @@ static int show_stat(struct seq_file *p, void *v)
 		seq_put_decimal_ull(p, " ", nsec_to_clock_t(guest_nice));
 		seq_putc(p, '\n');
 	}
-	seq_put_decimal_ull(p, "intr ", (unsigned long long)sum);
-
-	/* sum again ? it could be updated? */
-	for_each_irq_nr(j)
-		seq_put_decimal_ull(p, " ", kstat_irqs_usr(j));
+	show_stat_irqs(p);
 
 	seq_printf(p,
-		"\nctxt %llu\n"
+		"ctxt %llu\n"
 		"btime %llu\n"
 		"processes %lu\n"
 		"procs_running %lu\n"
-- 
1.8.3.1



* [PATCH v3 2/4] /proc/stat: Only do percpu sum of active IRQs
  2019-01-09 19:20 [PATCH v3 0/4] /proc/stat: Reduce irqs counting performance overhead Waiman Long
  2019-01-09 19:20 ` [PATCH v3 1/4] /proc/stat: Extract irqs counting code into show_stat_irqs() Waiman Long
@ 2019-01-09 19:20 ` Waiman Long
  2019-01-09 19:20 ` [PATCH v3 3/4] genirq: Track the number " Waiman Long
  2019-01-09 19:20 ` [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for " Waiman Long
  3 siblings, 0 replies; 12+ messages in thread
From: Waiman Long @ 2019-01-09 19:20 UTC (permalink / raw)
  To: Andrew Morton, Alexey Dobriyan, Kees Cook, Thomas Gleixner
  Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap, Matthew Wilcox,
	Waiman Long

Recent computer systems may have hundreds or even thousands of IRQs
available. However, most of them may not be active and their IRQ counts
are zero. It is just a waste of CPU cycles to do percpu summation of
those zero counts.

In order to find out if an IRQ is active, we track the transition of the
percpu count from 0 to 1 and atomically increment a new kstat_irq_cpus
counter which counts the number of CPUs that handle this particular IRQ.

The IRQ descriptor is zalloc'ed, so there is no need to initialize the
new counter.

On a 4-socket Broadwell server with 112 vCPUs and 2952 IRQs (2877 of
them with a count of 0), the system time needed to read /proc/stat 50k
times was reduced from 11.200s to 8.048s. That is an execution time
reduction of 28%.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 include/linux/irqdesc.h | 1 +
 kernel/irq/internals.h  | 3 ++-
 kernel/irq/irqdesc.c    | 2 +-
 3 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/include/linux/irqdesc.h b/include/linux/irqdesc.h
index dd1e40d..86bbad2 100644
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -61,6 +61,7 @@ struct irq_desc {
 	irq_preflow_handler_t	preflow_handler;
 #endif
 	struct irqaction	*action;	/* IRQ action list */
+	atomic_t		kstat_irq_cpus;	/* #cpus handling this IRQ */
 	unsigned int		status_use_accessors;
 	unsigned int		core_internal_state__do_not_mess_with_it;
 	unsigned int		depth;		/* nested irq disables */
diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index ca6afa2..31787c1 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -244,7 +244,8 @@ static inline void irq_state_set_masked(struct irq_desc *desc)
 
 static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
 {
-	__this_cpu_inc(*desc->kstat_irqs);
+	if (unlikely(__this_cpu_inc_return(*desc->kstat_irqs) == 1))
+		atomic_inc(&desc->kstat_irq_cpus);
 	__this_cpu_inc(kstat.irqs_sum);
 }
 
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index ee062b7..3d2c38b 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -922,7 +922,7 @@ unsigned int kstat_irqs(unsigned int irq)
 	int cpu;
 	unsigned int sum = 0;
 
-	if (!desc || !desc->kstat_irqs)
+	if (!desc || !desc->kstat_irqs || !atomic_read(&desc->kstat_irq_cpus))
 		return 0;
 	for_each_possible_cpu(cpu)
 		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
-- 
1.8.3.1



* [PATCH v3 3/4] genirq: Track the number of active IRQs
  2019-01-09 19:20 [PATCH v3 0/4] /proc/stat: Reduce irqs counting performance overhead Waiman Long
  2019-01-09 19:20 ` [PATCH v3 1/4] /proc/stat: Extract irqs counting code into show_stat_irqs() Waiman Long
  2019-01-09 19:20 ` [PATCH v3 2/4] /proc/stat: Only do percpu sum of active IRQs Waiman Long
@ 2019-01-09 19:20 ` Waiman Long
  2019-01-09 19:20 ` [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for " Waiman Long
  3 siblings, 0 replies; 12+ messages in thread
From: Waiman Long @ 2019-01-09 19:20 UTC (permalink / raw)
  To: Andrew Morton, Alexey Dobriyan, Kees Cook, Thomas Gleixner
  Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap, Matthew Wilcox,
	Waiman Long

A new atomic variable, nr_active_irqs, is added to track the number of
active IRQs, i.e. those that have had one or more interrupts serviced.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 kernel/irq/internals.h | 7 +++++--
 kernel/irq/irqdesc.c   | 5 +++++
 2 files changed, 10 insertions(+), 2 deletions(-)

diff --git a/kernel/irq/internals.h b/kernel/irq/internals.h
index 31787c1..239ae51 100644
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -242,10 +242,13 @@ static inline void irq_state_set_masked(struct irq_desc *desc)
 
 #undef __irqd_to_state
 
+extern atomic_t nr_active_irqs;	/* # of active IRQs */
 static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
 {
-	if (unlikely(__this_cpu_inc_return(*desc->kstat_irqs) == 1))
-		atomic_inc(&desc->kstat_irq_cpus);
+	if (unlikely(__this_cpu_inc_return(*desc->kstat_irqs) == 1)) {
+		if (atomic_inc_return(&desc->kstat_irq_cpus) == 1)
+			atomic_inc(&nr_active_irqs);
+	}
 	__this_cpu_inc(kstat.irqs_sum);
 }
 
diff --git a/kernel/irq/irqdesc.c b/kernel/irq/irqdesc.c
index 3d2c38b..7e00c4d 100644
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -20,6 +20,11 @@
 #include "internals.h"
 
 /*
+ * Number of active IRQs
+ */
+atomic_t nr_active_irqs;
+
+/*
  * lockdep: we want to handle all irq_desc locks as a single lock-class:
  */
 static struct lock_class_key irq_desc_lock_class;
-- 
1.8.3.1



* [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for active IRQs
  2019-01-09 19:20 [PATCH v3 0/4] /proc/stat: Reduce irqs counting performance overhead Waiman Long
                   ` (2 preceding siblings ...)
  2019-01-09 19:20 ` [PATCH v3 3/4] genirq: Track the number " Waiman Long
@ 2019-01-09 19:20 ` Waiman Long
  2019-01-11 17:23   ` Thomas Gleixner
  3 siblings, 1 reply; 12+ messages in thread
From: Waiman Long @ 2019-01-09 19:20 UTC (permalink / raw)
  To: Andrew Morton, Alexey Dobriyan, Kees Cook, Thomas Gleixner
  Cc: linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap, Matthew Wilcox,
	Waiman Long

After skipping the percpu summation of inactive IRQs on a 4-socket
Broadwell system with about 3k IRQs, about half of the CPU cycles were
spent in the kstat_irqs() call, most of them used to look up the IRQ
descriptors for the corresponding IRQ numbers.

We can recoup many of those lost cycles by calling kstat_irqs_usr()
only for the IRQs that are active. A bitmap is now used to keep track
of the active IRQs. Any change in the nr_active_irqs count will cause
the code to rescan all the IRQs and repopulate the bitmap.

On the same 4-socket server, this patch further reduces the system time
of reading /proc/stat 5k times from 8.048s to 5.817s. This is another
time reduction of 28%.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 fs/proc/stat.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 84 insertions(+)

diff --git a/fs/proc/stat.c b/fs/proc/stat.c
index 4b06f1b..24a7af6 100644
--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -93,6 +93,25 @@ static u64 compute_stat_irqs_sum(void)
 }
 
 /*
+ * Write the given number of space separated '0' into the sequence file.
+ */
+static void write_zeros(struct seq_file *p, int cnt)
+{
+	/* String of 16 '0's */
+	static const char zeros[] = " 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0";
+
+	while (cnt > 0) {
+		if (cnt >= 16) {
+			seq_write(p, zeros, 32);
+			cnt -= 16;
+		} else {
+			seq_write(p, zeros, 2 * cnt);
+			cnt = 0;
+		}
+	}
+}
+
+/*
  * Print out the "intr" line of /proc/stat.
  */
 static void show_stat_irqs(struct seq_file *p)
@@ -100,9 +119,74 @@ static void show_stat_irqs(struct seq_file *p)
 	int i;
 
 	seq_put_decimal_ull(p, "intr ", compute_stat_irqs_sum());
+
+	if (IS_ENABLED(CONFIG_SMP) && (nr_cpu_ids >= 10) && (nr_irqs >= 256)) {
+		/*
+		 * On systems with 10 or more CPUs and 256 or more IRQs,
+		 * we used a bitmap to keep track of the number of active
+		 * IRQs and call kstat_irqs_usr() only for those IRQs.
+		 * The bitmap will be refreshed whenever nr_active_irqs
+		 * changes.
+		 */
+		extern atomic_t nr_active_irqs;
+		static DEFINE_MUTEX(irqs_mutex);
+		static int last_irq = -1;
+		static int bitmap_size, active_irqs;
+		static unsigned long *bitmap;
+		int current_irqs = atomic_read(&nr_active_irqs);
+
+		mutex_lock(&irqs_mutex);
+		if (current_irqs != active_irqs) {
+			/*
+			 * Rescan all the IRQs for active ones.
+			 */
+			if (nr_irqs > bitmap_size) {
+				static unsigned long *new_bitmap;
+				static int new_size;
+
+				new_size = BITS_TO_LONGS(nr_irqs)*sizeof(long);
+				new_bitmap = (unsigned long *)krealloc(bitmap,
+						new_size, GFP_KERNEL);
+				if (!new_bitmap)
+					goto fallback;
+				bitmap = new_bitmap;
+				bitmap_size = new_size;
+			}
+			memset(bitmap, 0, bitmap_size/BITS_PER_BYTE);
+			last_irq = -1;
+			for_each_irq_nr(i) {
+				int cnt = kstat_irqs_usr(i);
+
+				if (cnt) {
+					bitmap_set(bitmap, 0, i);
+					last_irq = i;
+				}
+				seq_put_decimal_ull(p, " ", cnt);
+			}
+			active_irqs = current_irqs;
+			mutex_unlock(&irqs_mutex);
+			goto out;
+		}
+		/*
+		 * Retrieve counts from active IRQs only.
+		 */
+		for (i = 0; i <= last_irq; i++) {
+			int next = find_next_bit(bitmap, last_irq + 1, i);
+
+			if (next > i)
+				write_zeros(p, next - i);
+			i = next;
+			seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
+		}
+		mutex_unlock(&irqs_mutex);
+		write_zeros(p, nr_irqs - i);
+		goto out;
+	}
+fallback:
 	for_each_irq_nr(i)
 		seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
 
+out:
 	seq_putc(p, '\n');
 }
 
-- 
1.8.3.1



* Re: [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for active IRQs
  2019-01-09 19:20 ` [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for " Waiman Long
@ 2019-01-11 17:23   ` Thomas Gleixner
  2019-01-11 19:19     ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2019-01-11 17:23 UTC (permalink / raw)
  To: Waiman Long
  Cc: Andrew Morton, Alexey Dobriyan, Kees Cook, linux-kernel,
	linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap, Matthew Wilcox

On Wed, 9 Jan 2019, Waiman Long wrote:
> ---
>  fs/proc/stat.c | 84 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  1 file changed, 84 insertions(+)

and the total diffstat of that patch series is:

   4 files changed, 125 insertions(+), 12 deletions(-)

>  static void show_stat_irqs(struct seq_file *p)
> @@ -100,9 +119,74 @@ static void show_stat_irqs(struct seq_file *p)
>  	int i;
>  
>  	seq_put_decimal_ull(p, "intr ", compute_stat_irqs_sum());
> +
> +	if (IS_ENABLED(CONFIG_SMP) && (nr_cpu_ids >= 10) && (nr_irqs >= 256)) {
> +		/*
> +		 * On systems with 10 or more CPUs and 256 or more IRQs,
> +		 * we used a bitmap to keep track of the number of active
> +		 * IRQs and call kstat_irqs_usr() only for those IRQs.
> +		 * The bitmap will be refreshed whenever nr_active_irqs
> +		 * changes.
> +		 */
> +		extern atomic_t nr_active_irqs;
> +		static DEFINE_MUTEX(irqs_mutex);
> +		static int last_irq = -1;
> +		static int bitmap_size, active_irqs;
> +		static unsigned long *bitmap;
> +		int current_irqs = atomic_read(&nr_active_irqs);

This is completely overengineered. The simple patch below does not need
conditionals and atomics, is unconditional and completely avoids this
bitmap hackery.

On a VM with 144 VCPUs and nr_irqs=1576 and a loop of 5000 readouts of
/proc/stat (python):

  Before       	       After
  real	0m1.331s       0m0.728s	 -45.3%
  user	0m0.415s       0m0.359s  -13.5%
  sys	0m0.914s       0m0.356s  -61.1%

Hmm?

Thanks,

	tglx

8<------------------

 fs/proc/stat.c          |   28 +++++++++++++++++++++++++---
 include/linux/irqdesc.h |    3 ++-
 kernel/irq/internals.h  |    1 +
 kernel/irq/irqdesc.c    |    7 ++-----
 4 files changed, 30 insertions(+), 9 deletions(-)

--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -79,6 +79,30 @@ static u64 get_iowait_time(int cpu)
 
 #endif
 
+static void show_irq_gap(struct seq_file *p, int gap)
+{
+	static const char zeros[] = " 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0";
+
+	while (gap > 0) {
+		int inc = min_t(int, gap, ARRAY_SIZE(zeros) / 2);
+
+		seq_write(p, zeros, 2 * inc);
+		gap -= inc;
+	}
+}
+
+static void show_all_irqs(struct seq_file *p)
+{
+	int i, next = 0;
+
+	for_each_active_irq(i) {
+		show_irq_gap(p, i - next);
+		seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
+		next = i + 1;
+	}
+	show_irq_gap(p, nr_irqs - next);
+}
+
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i, j;
@@ -156,9 +180,7 @@ static int show_stat(struct seq_file *p,
 	}
 	seq_put_decimal_ull(p, "intr ", (unsigned long long)sum);
 
-	/* sum again ? it could be updated? */
-	for_each_irq_nr(j)
-		seq_put_decimal_ull(p, " ", kstat_irqs_usr(j));
+	show_all_irqs(p);
 
 	seq_printf(p,
 		"\nctxt %llu\n"
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -65,9 +65,10 @@ struct irq_desc {
 	unsigned int		core_internal_state__do_not_mess_with_it;
 	unsigned int		depth;		/* nested irq disables */
 	unsigned int		wake_depth;	/* nested wake enables */
+	unsigned int		tot_count;
 	unsigned int		irq_count;	/* For detecting broken IRQs */
-	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
 	unsigned int		irqs_unhandled;
+	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
 	atomic_t		threads_handled;
 	int			threads_handled_last;
 	raw_spinlock_t		lock;
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -246,6 +246,7 @@ static inline void kstat_incr_irqs_this_
 {
 	__this_cpu_inc(*desc->kstat_irqs);
 	__this_cpu_inc(kstat.irqs_sum);
+	desc->tot_count++;
 }
 
 static inline int irq_desc_get_node(struct irq_desc *desc)
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -119,6 +119,7 @@ static void desc_set_defaults(unsigned i
 	desc->depth = 1;
 	desc->irq_count = 0;
 	desc->irqs_unhandled = 0;
+	desc->tot_count = 0;
 	desc->name = NULL;
 	desc->owner = owner;
 	for_each_possible_cpu(cpu)
@@ -919,14 +920,10 @@ unsigned int kstat_irqs_cpu(unsigned int
 unsigned int kstat_irqs(unsigned int irq)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
-	int cpu;
-	unsigned int sum = 0;
 
 	if (!desc || !desc->kstat_irqs)
 		return 0;
-	for_each_possible_cpu(cpu)
-		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
-	return sum;
+	return desc->tot_count;
 }
 
 /**




* Re: [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for active IRQs
  2019-01-11 17:23   ` Thomas Gleixner
@ 2019-01-11 19:19     ` Thomas Gleixner
  2019-01-11 19:23       ` Matthew Wilcox
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2019-01-11 19:19 UTC (permalink / raw)
  To: Waiman Long
  Cc: Andrew Morton, Alexey Dobriyan, Kees Cook, linux-kernel,
	linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap, Matthew Wilcox

On Fri, 11 Jan 2019, Thomas Gleixner wrote:
> --- a/kernel/irq/internals.h
> +++ b/kernel/irq/internals.h
> @@ -246,6 +246,7 @@ static inline void kstat_incr_irqs_this_
>  {
>  	__this_cpu_inc(*desc->kstat_irqs);
>  	__this_cpu_inc(kstat.irqs_sum);
> +	desc->tot_count++;

There is one issue here. True percpu interrupts, like the timer interrupts
on ARM(64), will access that in parallel. But that's not rocket science to
fix.

Thanks,

	tglx




* Re: [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for active IRQs
  2019-01-11 19:19     ` Thomas Gleixner
@ 2019-01-11 19:23       ` Matthew Wilcox
  2019-01-11 21:02         ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Matthew Wilcox @ 2019-01-11 19:23 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Waiman Long, Andrew Morton, Alexey Dobriyan, Kees Cook,
	linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap

On Fri, Jan 11, 2019 at 08:19:33PM +0100, Thomas Gleixner wrote:
> On Fri, 11 Jan 2019, Thomas Gleixner wrote:
> > --- a/kernel/irq/internals.h
> > +++ b/kernel/irq/internals.h
> > @@ -246,6 +246,7 @@ static inline void kstat_incr_irqs_this_
> >  {
> >  	__this_cpu_inc(*desc->kstat_irqs);
> >  	__this_cpu_inc(kstat.irqs_sum);
> > +	desc->tot_count++;
> 
> There is one issue here. True percpu interrupts, like the timer interrupts
> on ARM(64), will access that in parallel. But that's not rocket science to
> fix.

I was wondering about that from an efficiency point of view.  Since
interrupts are generally targeted to a single CPU, there's no cacheline
bouncing to speak of, except for interrupts like TLB shootdown on x86.
It might make sense for the percpu interrupts to still sum them at read
time, and not sum them at interrupt time.


* Re: [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for active IRQs
  2019-01-11 19:23       ` Matthew Wilcox
@ 2019-01-11 21:02         ` Thomas Gleixner
  2019-01-14 19:04           ` Waiman Long
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2019-01-11 21:02 UTC (permalink / raw)
  To: Matthew Wilcox
  Cc: Waiman Long, Andrew Morton, Alexey Dobriyan, Kees Cook,
	linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap

On Fri, 11 Jan 2019, Matthew Wilcox wrote:
> On Fri, Jan 11, 2019 at 08:19:33PM +0100, Thomas Gleixner wrote:
> > On Fri, 11 Jan 2019, Thomas Gleixner wrote:
> > > --- a/kernel/irq/internals.h
> > > +++ b/kernel/irq/internals.h
> > > @@ -246,6 +246,7 @@ static inline void kstat_incr_irqs_this_
> > >  {
> > >  	__this_cpu_inc(*desc->kstat_irqs);
> > >  	__this_cpu_inc(kstat.irqs_sum);
> > > +	desc->tot_count++;
> > 
> > There is one issue here. True percpu interrupts, like the timer interrupts
> > on ARM(64), will access that in parallel. But that's not rocket science to
> > fix.
> 
> I was wondering about that from an efficiency point of view.  Since
> interrupts are generally targetted to a single CPU, there's no cacheline
> bouncing to speak of, except for interrupts like TLB shootdown on x86.
> It might make sense for the percpu interrupts to still sum them at read
> time, and not sum them at interrupt time.

Yes, for regular interrupts the stats are properly serialized and the patch
works just fine. For true percpu interrupts we can just avoid the update
of tot_count and sum them up as before. The special vectors on x86 are
tot_count and sum them up as before. The special vectors on x86 are
accounted separately anyway and not part of the problem here. That
show_all_irqs() code only deals with device interrupts which are backed by
irq descriptors. TLB/IPI/.... are different beasts and handled low level in
the architecture code. Though ARM and some others have these interrupts as
regular Linux interrupt numbers. That's why we need to care.

Updated untested patch below.

Thanks,

	tglx

8<-------------

 fs/proc/stat.c          |   28 +++++++++++++++++++++++++---
 include/linux/irqdesc.h |    3 ++-
 kernel/irq/chip.c       |   12 ++++++++++--
 kernel/irq/internals.h  |    8 +++++++-
 kernel/irq/irqdesc.c    |    7 ++++++-
 5 files changed, 50 insertions(+), 8 deletions(-)

--- a/fs/proc/stat.c
+++ b/fs/proc/stat.c
@@ -79,6 +79,30 @@ static u64 get_iowait_time(int cpu)
 
 #endif
 
+static void show_irq_gap(struct seq_file *p, int gap)
+{
+	static const char zeros[] = " 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0";
+
+	while (gap > 0) {
+		int inc = min_t(int, gap, ARRAY_SIZE(zeros) / 2);
+
+		seq_write(p, zeros, 2 * inc);
+		gap -= inc;
+	}
+}
+
+static void show_all_irqs(struct seq_file *p)
+{
+	int i, next = 0;
+
+	for_each_active_irq(i) {
+		show_irq_gap(p, i - next);
+		seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
+		next = i + 1;
+	}
+	show_irq_gap(p, nr_irqs - next);
+}
+
 static int show_stat(struct seq_file *p, void *v)
 {
 	int i, j;
@@ -156,9 +180,7 @@ static int show_stat(struct seq_file *p,
 	}
 	seq_put_decimal_ull(p, "intr ", (unsigned long long)sum);
 
-	/* sum again ? it could be updated? */
-	for_each_irq_nr(j)
-		seq_put_decimal_ull(p, " ", kstat_irqs_usr(j));
+	show_all_irqs(p);
 
 	seq_printf(p,
 		"\nctxt %llu\n"
--- a/include/linux/irqdesc.h
+++ b/include/linux/irqdesc.h
@@ -65,9 +65,10 @@ struct irq_desc {
 	unsigned int		core_internal_state__do_not_mess_with_it;
 	unsigned int		depth;		/* nested irq disables */
 	unsigned int		wake_depth;	/* nested wake enables */
+	unsigned int		tot_count;
 	unsigned int		irq_count;	/* For detecting broken IRQs */
-	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
 	unsigned int		irqs_unhandled;
+	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
 	atomic_t		threads_handled;
 	int			threads_handled_last;
 	raw_spinlock_t		lock;
--- a/kernel/irq/chip.c
+++ b/kernel/irq/chip.c
@@ -855,7 +855,11 @@ void handle_percpu_irq(struct irq_desc *
 {
 	struct irq_chip *chip = irq_desc_get_chip(desc);
 
-	kstat_incr_irqs_this_cpu(desc);
+	/*
+	 * PER CPU interrupts are not serialized. Do not touch
+	 * desc->tot_count.
+	 */
+	__kstat_incr_irqs_this_cpu(desc);
 
 	if (chip->irq_ack)
 		chip->irq_ack(&desc->irq_data);
@@ -884,7 +888,11 @@ void handle_percpu_devid_irq(struct irq_
 	unsigned int irq = irq_desc_get_irq(desc);
 	irqreturn_t res;
 
-	kstat_incr_irqs_this_cpu(desc);
+	/*
+	 * PER CPU interrupts are not serialized. Do not touch
+	 * desc->tot_count.
+	 */
+	__kstat_incr_irqs_this_cpu(desc);
 
 	if (chip->irq_ack)
 		chip->irq_ack(&desc->irq_data);
--- a/kernel/irq/internals.h
+++ b/kernel/irq/internals.h
@@ -242,12 +242,18 @@ static inline void irq_state_set_masked(
 
 #undef __irqd_to_state
 
-static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
+static inline void __kstat_incr_irqs_this_cpu(struct irq_desc *desc)
 {
 	__this_cpu_inc(*desc->kstat_irqs);
 	__this_cpu_inc(kstat.irqs_sum);
 }
 
+static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
+{
+	__kstat_incr_irqs_this_cpu(desc);
+	desc->tot_count++;
+}
+
 static inline int irq_desc_get_node(struct irq_desc *desc)
 {
 	return irq_common_data_get_node(&desc->irq_common_data);
--- a/kernel/irq/irqdesc.c
+++ b/kernel/irq/irqdesc.c
@@ -119,6 +119,7 @@ static void desc_set_defaults(unsigned i
 	desc->depth = 1;
 	desc->irq_count = 0;
 	desc->irqs_unhandled = 0;
+	desc->tot_count = 0;
 	desc->name = NULL;
 	desc->owner = owner;
 	for_each_possible_cpu(cpu)
@@ -919,11 +920,15 @@ unsigned int kstat_irqs_cpu(unsigned int
 unsigned int kstat_irqs(unsigned int irq)
 {
 	struct irq_desc *desc = irq_to_desc(irq);
-	int cpu;
 	unsigned int sum = 0;
+	int cpu;
 
 	if (!desc || !desc->kstat_irqs)
 		return 0;
+	if (!irq_settings_is_per_cpu_devid(desc) &&
+	    !irq_settings_is_per_cpu(desc))
+	    return desc->tot_count;
+
 	for_each_possible_cpu(cpu)
 		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
 	return sum;




* Re: [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for active IRQs
  2019-01-11 21:02         ` Thomas Gleixner
@ 2019-01-14 19:04           ` Waiman Long
  2019-01-15  9:24             ` Thomas Gleixner
  0 siblings, 1 reply; 12+ messages in thread
From: Waiman Long @ 2019-01-14 19:04 UTC (permalink / raw)
  To: Thomas Gleixner, Matthew Wilcox
  Cc: Andrew Morton, Alexey Dobriyan, Kees Cook, linux-kernel,
	linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap

On 01/11/2019 04:02 PM, Thomas Gleixner wrote:
> On Fri, 11 Jan 2019, Matthew Wilcox wrote:
>> On Fri, Jan 11, 2019 at 08:19:33PM +0100, Thomas Gleixner wrote:
>>> On Fri, 11 Jan 2019, Thomas Gleixner wrote:
>>>> --- a/kernel/irq/internals.h
>>>> +++ b/kernel/irq/internals.h
>>>> @@ -246,6 +246,7 @@ static inline void kstat_incr_irqs_this_
>>>>  {
>>>>  	__this_cpu_inc(*desc->kstat_irqs);
>>>>  	__this_cpu_inc(kstat.irqs_sum);
>>>> +	desc->tot_count++;
>>> There is one issue here. True percpu interrupts, like the timer interrupts
>>> on ARM(64), will access that in parallel. But that's not rocket science to
>>> fix.
>> I was wondering about that from an efficiency point of view.  Since
>> interrupts are generally targetted to a single CPU, there's no cacheline
>> bouncing to speak of, except for interrupts like TLB shootdown on x86.
>> It might make sense for the percpu interrupts to still sum them at read
>> time, and not sum them at interrupt time.
> Yes, for regular interrupts the stats are properly serialized and the patch
> works just fine. For true percpu interrupts we just can avoid the update of
> tot_count and sum them up as before. The special vectors on x86 are
> accounted separately anyway and not part of the problem here. That
> show_all_irqs() code only deals with device interrupts which are backed by
> irq descriptors. TLB/IPI/.... are different beasts and handled low level in
> the architecture code. Though ARM and some others have these interrupts as
> regular Linux interrupt numbers. That's why we need to care.
>
> Updated untested patch below.
>
> Thanks,
>
> 	tglx
>
> 8<-------------
>
>  fs/proc/stat.c          |   28 +++++++++++++++++++++++++---
>  include/linux/irqdesc.h |    3 ++-
>  kernel/irq/chip.c       |   12 ++++++++++--
>  kernel/irq/internals.h  |    8 +++++++-
>  kernel/irq/irqdesc.c    |    7 ++++++-
>  5 files changed, 50 insertions(+), 8 deletions(-)
>
> --- a/fs/proc/stat.c
> +++ b/fs/proc/stat.c
> @@ -79,6 +79,30 @@ static u64 get_iowait_time(int cpu)
>  
>  #endif
>  
> +static void show_irq_gap(struct seq_file *p, int gap)
> +{
> +	static const char zeros[] = " 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0";
> +
> +	while (gap > 0) {
> +		int inc = min_t(int, gap, ARRAY_SIZE(zeros) / 2);
> +
> +		seq_write(p, zeros, 2 * inc);
> +		gap -= inc;
> +	}
> +}
> +
> +static void show_all_irqs(struct seq_file *p)
> +{
> +	int i, next = 0;
> +
> +	for_each_active_irq(i) {
> +		show_irq_gap(p, i - next);
> +		seq_put_decimal_ull(p, " ", kstat_irqs_usr(i));
> +		next = i + 1;
> +	}
> +	show_irq_gap(p, nr_irqs - next);
> +}
> +
>  static int show_stat(struct seq_file *p, void *v)
>  {
>  	int i, j;
> @@ -156,9 +180,7 @@ static int show_stat(struct seq_file *p,
>  	}
>  	seq_put_decimal_ull(p, "intr ", (unsigned long long)sum);
>  
> -	/* sum again ? it could be updated? */
> -	for_each_irq_nr(j)
> -		seq_put_decimal_ull(p, " ", kstat_irqs_usr(j));
> +	show_all_irqs(p);
>  
>  	seq_printf(p,
>  		"\nctxt %llu\n"
> --- a/include/linux/irqdesc.h
> +++ b/include/linux/irqdesc.h
> @@ -65,9 +65,10 @@ struct irq_desc {
>  	unsigned int		core_internal_state__do_not_mess_with_it;
>  	unsigned int		depth;		/* nested irq disables */
>  	unsigned int		wake_depth;	/* nested wake enables */
> +	unsigned int		tot_count;
>  	unsigned int		irq_count;	/* For detecting broken IRQs */
> -	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
>  	unsigned int		irqs_unhandled;
> +	unsigned long		last_unhandled;	/* Aging timer for unhandled count */
>  	atomic_t		threads_handled;
>  	int			threads_handled_last;
>  	raw_spinlock_t		lock;
> --- a/kernel/irq/chip.c
> +++ b/kernel/irq/chip.c
> @@ -855,7 +855,11 @@ void handle_percpu_irq(struct irq_desc *
>  {
>  	struct irq_chip *chip = irq_desc_get_chip(desc);
>  
> -	kstat_incr_irqs_this_cpu(desc);
> +	/*
> +	 * PER CPU interrupts are not serialized. Do not touch
> +	 * desc->tot_count.
> +	 */
> +	__kstat_incr_irqs_this_cpu(desc);
>  
>  	if (chip->irq_ack)
>  		chip->irq_ack(&desc->irq_data);
> @@ -884,7 +888,11 @@ void handle_percpu_devid_irq(struct irq_
>  	unsigned int irq = irq_desc_get_irq(desc);
>  	irqreturn_t res;
>  
> -	kstat_incr_irqs_this_cpu(desc);
> +	/*
> +	 * PER CPU interrupts are not serialized. Do not touch
> +	 * desc->tot_count.
> +	 */
> +	__kstat_incr_irqs_this_cpu(desc);
>  
>  	if (chip->irq_ack)
>  		chip->irq_ack(&desc->irq_data);
> --- a/kernel/irq/internals.h
> +++ b/kernel/irq/internals.h
> @@ -242,12 +242,18 @@ static inline void irq_state_set_masked(
>  
>  #undef __irqd_to_state
>  
> -static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
> +static inline void __kstat_incr_irqs_this_cpu(struct irq_desc *desc)
>  {
>  	__this_cpu_inc(*desc->kstat_irqs);
>  	__this_cpu_inc(kstat.irqs_sum);
>  }
>  
> +static inline void kstat_incr_irqs_this_cpu(struct irq_desc *desc)
> +{
> +	__kstat_incr_irqs_this_cpu(desc);
> +	desc->tot_count++;
> +}
> +
>  static inline int irq_desc_get_node(struct irq_desc *desc)
>  {
>  	return irq_common_data_get_node(&desc->irq_common_data);
> --- a/kernel/irq/irqdesc.c
> +++ b/kernel/irq/irqdesc.c
> @@ -119,6 +119,7 @@ static void desc_set_defaults(unsigned i
>  	desc->depth = 1;
>  	desc->irq_count = 0;
>  	desc->irqs_unhandled = 0;
> +	desc->tot_count = 0;
>  	desc->name = NULL;
>  	desc->owner = owner;
>  	for_each_possible_cpu(cpu)
> @@ -919,11 +920,15 @@ unsigned int kstat_irqs_cpu(unsigned int
>  unsigned int kstat_irqs(unsigned int irq)
>  {
>  	struct irq_desc *desc = irq_to_desc(irq);
> -	int cpu;
>  	unsigned int sum = 0;
> +	int cpu;
>  
>  	if (!desc || !desc->kstat_irqs)
>  		return 0;
> +	if (!irq_settings_is_per_cpu_devid(desc) &&
> +	    !irq_settings_is_per_cpu(desc))
> +	    return desc->tot_count;
> +
>  	for_each_possible_cpu(cpu)
>  		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
>  	return sum;
>
>
That looks good to me. Thanks for providing a more simple solution.

BTW, if the percpu IRQ is known at allocation time, maybe we should just
not allocate a percpu count for the corresponding descriptor.

Cheers,
Longman




* Re: [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for active IRQs
  2019-01-14 19:04           ` Waiman Long
@ 2019-01-15  9:24             ` Thomas Gleixner
  2019-01-15 15:52               ` Waiman Long
  0 siblings, 1 reply; 12+ messages in thread
From: Thomas Gleixner @ 2019-01-15  9:24 UTC (permalink / raw)
  To: Waiman Long
  Cc: Matthew Wilcox, Andrew Morton, Alexey Dobriyan, Kees Cook,
	linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap

Waiman,

On Mon, 14 Jan 2019, Waiman Long wrote:
> On 01/11/2019 04:02 PM, Thomas Gleixner wrote:
> > @@ -919,11 +920,15 @@ unsigned int kstat_irqs_cpu(unsigned int
> >  unsigned int kstat_irqs(unsigned int irq)
> >  {
> >  	struct irq_desc *desc = irq_to_desc(irq);
> > -	int cpu;
> >  	unsigned int sum = 0;
> > +	int cpu;
> >  
> >  	if (!desc || !desc->kstat_irqs)
> >  		return 0;
> > +	if (!irq_settings_is_per_cpu_devid(desc) &&
> > +	    !irq_settings_is_per_cpu(desc))
> > +	    return desc->tot_count;
> > +
> >  	for_each_possible_cpu(cpu)
> >  		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
> >  	return sum;
> >
> >
> That looks good to me. Thanks for providing a more simple solution.
> 
> BTW, if the percpu IRQ is known at allocation time, maybe we should just
> not allocate a percpu count for the corresponding descriptor.

Nope. You still need the per cpu accounting for /proc/interrupts ...

Thanks,

	tglx


* Re: [PATCH v3 4/4] /proc/stat: Call kstat_irqs_usr() only for active IRQs
  2019-01-15  9:24             ` Thomas Gleixner
@ 2019-01-15 15:52               ` Waiman Long
  0 siblings, 0 replies; 12+ messages in thread
From: Waiman Long @ 2019-01-15 15:52 UTC (permalink / raw)
  To: Thomas Gleixner
  Cc: Matthew Wilcox, Andrew Morton, Alexey Dobriyan, Kees Cook,
	linux-kernel, linux-fsdevel, Davidlohr Bueso, Miklos Szeredi,
	Daniel Colascione, Dave Chinner, Randy Dunlap

On 01/15/2019 04:24 AM, Thomas Gleixner wrote:
> Waiman,
>
> On Mon, 14 Jan 2019, Waiman Long wrote:
>> On 01/11/2019 04:02 PM, Thomas Gleixner wrote:
>>> @@ -919,11 +920,15 @@ unsigned int kstat_irqs_cpu(unsigned int
>>>  unsigned int kstat_irqs(unsigned int irq)
>>>  {
>>>  	struct irq_desc *desc = irq_to_desc(irq);
>>> -	int cpu;
>>>  	unsigned int sum = 0;
>>> +	int cpu;
>>>  
>>>  	if (!desc || !desc->kstat_irqs)
>>>  		return 0;
>>> +	if (!irq_settings_is_per_cpu_devid(desc) &&
>>> +	    !irq_settings_is_per_cpu(desc))
>>> +	    return desc->tot_count;
>>> +
>>>  	for_each_possible_cpu(cpu)
>>>  		sum += *per_cpu_ptr(desc->kstat_irqs, cpu);
>>>  	return sum;
>>>
>>>
>> That looks good to me. Thanks for providing a more simple solution.
>>
>> BTW, if the percpu IRQ is known at allocation time, maybe we should just
>> not allocate a percpu count for the corresponding descriptor.
> Nope. You still need the per cpu accounting for /proc/interrupts ..

Yes, you are right.

Thanks,
Longman

