* [PATCH] sched: reduce /proc/schedstat access times
@ 2012-02-02 20:55 Eric Dumazet
From: Eric Dumazet @ 2012-02-02 20:55 UTC (permalink / raw)
To: Ingo Molnar; +Cc: linux-kernel, Peter Zijlstra
On a 16-CPU NUMA machine, /proc/schedstat can be quite long:
# wc -c /proc/schedstat
8355 /proc/schedstat
It appears show_schedstat() is called three times, because the initial
seq_file buffer size is underestimated.
The seq buffer must be reallocated twice, so we spend 3x the time
needed.
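The arithmetic behind the three calls can be sketched as follows. This is a minimal userspace model, not kernel code: seq_show_passes() is a hypothetical helper, and it assumes 4 KiB pages (so the old estimate PAGE_SIZE * (1 + 16 / 32) comes to 4096 bytes), a buffer that doubles after each overflow, and that a completely full buffer counts as an overflow.

```c
#include <stddef.h>

/*
 * Hypothetical model of the seq_file retry loop: the show callback is
 * rerun from scratch each time the buffer overflows, and the buffer
 * size is doubled before the next attempt. A buffer filled exactly to
 * its size is treated as overflowed.
 *
 * Returns the number of times the show callback would run, or 0 if
 * initial_size is 0 (doubling would never make progress).
 */
static unsigned int seq_show_passes(size_t initial_size, size_t output_len)
{
	size_t size = initial_size;
	unsigned int passes = 1;

	if (size == 0)
		return 0;

	while (output_len >= size) {
		size <<= 1;	/* reallocate with twice the space */
		passes++;	/* and run the show callback again */
	}
	return passes;
}
```

Under those assumptions, seq_show_passes(4096, 8355) walks through 4096, 8192 and 16384 bytes, which corresponds to the three calls and two reallocations described above.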
A quick fix is to maintain the maximum length reached in
show_schedstat() instead of guessing.
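The effect of remembering the maximum length can be sketched in userspace as well. The names below (len_hint, alloc_size_for(), read_schedstat_model()) are hypothetical, and rounding the request up to a power of two is only an assumed stand-in for what ksize() reports after kmalloc(); the real allocator's size classes may differ.

```c
#include <stddef.h>

/* High-water mark of the output length seen so far (model only). */
static size_t len_hint = 4096;

/* Assumed stand-in for ksize(kmalloc(n)): round up to a power of two. */
static size_t alloc_size_for(size_t n)
{
	size_t p = 1;

	while (p < n)
		p <<= 1;
	return p;
}

/*
 * Model one open+read of the file: start from the remembered hint,
 * double on overflow, then record the length that was reached.
 * Returns how many times the show callback would run.
 */
static unsigned int read_schedstat_model(size_t output_len)
{
	size_t size = alloc_size_for(len_hint);
	unsigned int passes = 1;

	while (output_len >= size) {
		size <<= 1;
		passes++;
	}
	if (output_len > len_hint)
		len_hint = output_len;	/* remember for the next open */
	return passes;
}
```

In this model the first read of an 8355-byte file still takes three passes, but later reads start from a large enough buffer and need only one.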
Also use seq_bitmap() to avoid the mask_str temp variable.
Signed-off-by: Eric Dumazet <eric.dumazet@gmail.com>
Cc: Ingo Molnar <mingo@elte.hu>
Cc: Peter Zijlstra <peterz@infradead.org>
---
kernel/sched/stats.c | 19 +++++++------------
1 file changed, 7 insertions(+), 12 deletions(-)
diff --git a/kernel/sched/stats.c b/kernel/sched/stats.c
index 2a581ba..34bb9c8 100644
--- a/kernel/sched/stats.c
+++ b/kernel/sched/stats.c
@@ -11,15 +11,11 @@
* format, so that tools can adapt (or abort)
*/
#define SCHEDSTAT_VERSION 15
+static size_t max_schedstat_len = 4096;
static int show_schedstat(struct seq_file *seq, void *v)
{
int cpu;
- int mask_len = DIV_ROUND_UP(NR_CPUS, 32) * 9;
- char *mask_str = kmalloc(mask_len, GFP_KERNEL);
-
- if (mask_str == NULL)
- return -ENOMEM;
seq_printf(seq, "version %d\n", SCHEDSTAT_VERSION);
seq_printf(seq, "timestamp %lu\n", jiffies);
@@ -47,9 +43,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
for_each_domain(cpu, sd) {
enum cpu_idle_type itype;
- cpumask_scnprintf(mask_str, mask_len,
- sched_domain_span(sd));
- seq_printf(seq, "domain%d %s", dcount++, mask_str);
+ seq_printf(seq, "domain%d ", dcount++);
+ seq_bitmap(seq, cpumask_bits(sched_domain_span(sd)),
+ nr_cpumask_bits);
for (itype = CPU_IDLE; itype < CPU_MAX_IDLE_TYPES;
itype++) {
seq_printf(seq, " %u %u %u %u %u %u %u %u",
@@ -73,14 +69,13 @@ static int show_schedstat(struct seq_file *seq, void *v)
rcu_read_unlock();
#endif
}
- kfree(mask_str);
+ max_schedstat_len = max(max_schedstat_len, seq->count);
return 0;
}
static int schedstat_open(struct inode *inode, struct file *file)
{
- unsigned int size = PAGE_SIZE * (1 + num_online_cpus() / 32);
- char *buf = kmalloc(size, GFP_KERNEL);
+ char *buf = kmalloc(max_schedstat_len, GFP_KERNEL);
struct seq_file *m;
int res;
@@ -90,7 +85,7 @@ static int schedstat_open(struct inode *inode, struct file *file)
if (!res) {
m = file->private_data;
m->buf = buf;
- m->size = size;
+ m->size = ksize(buf);
} else
kfree(buf);
return res;
* Re: [PATCH] sched: reduce /proc/schedstat access times
From: Ingo Molnar @ 2012-02-03 7:52 UTC (permalink / raw)
To: Eric Dumazet; +Cc: linux-kernel, Peter Zijlstra, Arnaldo Carvalho de Melo
* Eric Dumazet <eric.dumazet@gmail.com> wrote:
> On a 16 cpus NUMA machine, we can have quite a long /proc/schedstat
>
> # wc -c /proc/schedstat
> 8355 /proc/schedstat
Btw., the long-term goal would be to make the schedstats info
fully available via perf and integrate it into 'perf sched' - or
'perf stat --sched' or 'perf schedstat' (whichever variant suits
the person who first implements it).
> @@ -47,9 +43,9 @@ static int show_schedstat(struct seq_file *seq, void *v)
> for_each_domain(cpu, sd) {
> enum cpu_idle_type itype;
>
> - cpumask_scnprintf(mask_str, mask_len,
> - sched_domain_span(sd));
> - seq_printf(seq, "domain%d %s", dcount++, mask_str);
> + seq_printf(seq, "domain%d ", dcount++);
> + seq_bitmap(seq, cpumask_bits(sched_domain_span(sd)),
> + nr_cpumask_bits);
> for (itype = CPU_IDLE; itype < CPU_MAX_IDLE_TYPES;
> itype++) {
> seq_printf(seq, " %u %u %u %u %u %u %u %u",
That way, via perf, all information is passed in binary form
through the perf ring-buffer, so there is no formatting
overhead (only during post-processing), no restart artifacts due
to seqfile limitations, etc.
Thanks,
Ingo