* [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage @ 2016-03-22 8:37 Zhao Lei 2016-03-22 8:37 ` [PATCH v4 1/4] cpuacct: rename parameter in cpuusage_write for readability Zhao Lei ` (4 more replies) 0 siblings, 5 replies; 23+ messages in thread From: Zhao Lei @ 2016-03-22 8:37 UTC (permalink / raw) To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Zhao Lei Sometimes, cpuacct.usage is not detialed enough to user to see how much usage a group used. We want to know how much time it used in user mode and how much in kernel mode. This patch introduce some more files to tell user these information. # ls /sys/fs/cgroup/cpuacct/cpuacct.usage* /sys/fs/cgroup/cpuacct/cpuacct.usage /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu /sys/fs/cgroup/cpuacct/cpuacct.usage_user /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_user /sys/fs/cgroup/cpuacct/cpuacct.usage_sys /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_sys Changelog v3->v4: 1: Add a patch to use for_each_possible_cpu to iterate all cpus, Suggested-by: Peter Zijlstra <peterz@infradead.org> Changelog v2->v3: 1: Remove some unnecessary locks and so some cleanup based on suggestion from Peter Zijlstra <peterz@infradead.org> Changelog v1->v2: 1: Rebase on top of 4.5-rc6 2: Fix little spelling typo in description. 3: Fix line over 80 characters Yang Dongsheng (2): cpuacct: rename parameter in cpuusage_write for readability cpuacct: split usage into user_usage and sys_usage. Zhao Lei (2): cpuacct: small restruct for cpuacct cpuacct: Show possible_cpu in cpuacct kernel/sched/cpuacct.c | 173 ++++++++++++++++++++++++++++++++++--------------- kernel/sched/cpuacct.h | 4 +- 2 files changed, 124 insertions(+), 53 deletions(-) -- 1.8.5.1 ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH v4 1/4] cpuacct: rename parameter in cpuusage_write for readability 2016-03-22 8:37 [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei @ 2016-03-22 8:37 ` Zhao Lei 2016-03-22 8:37 ` [PATCH v4 2/4] cpuacct: small restruct for cpuacct Zhao Lei ` (3 subsequent siblings) 4 siblings, 0 replies; 23+ messages in thread From: Zhao Lei @ 2016-03-22 8:37 UTC (permalink / raw) To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Yang Dongsheng, Zhao Lei From: Yang Dongsheng <yangds.fnst@cn.fujitsu.com> The name of 'reset' makes a little confusion in reading, we would say, if we want to reset usage, return -EINVAL. That's not true. Actually, we want to say, we only allow user to do a reset. This patch rename reset to val and add a comment here, making the code more readable. From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> --- kernel/sched/cpuacct.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index dd7cbb5..9c2bbf7 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -145,13 +145,16 @@ static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft) } static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft, - u64 reset) + u64 val) { struct cpuacct *ca = css_ca(css); int err = 0; int i; - if (reset) { + /* + * Only allow '0' here to do a reset. + */ + if (val) { err = -EINVAL; goto out; } -- 1.8.5.1 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v4 2/4] cpuacct: small restruct for cpuacct 2016-03-22 8:37 [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei 2016-03-22 8:37 ` [PATCH v4 1/4] cpuacct: rename parameter in cpuusage_write for readability Zhao Lei @ 2016-03-22 8:37 ` Zhao Lei 2016-03-22 8:37 ` [PATCH v4 3/4] cpuacct: Show possible_cpu in cpuacct Zhao Lei ` (2 subsequent siblings) 4 siblings, 0 replies; 23+ messages in thread From: Zhao Lei @ 2016-03-22 8:37 UTC (permalink / raw) To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Zhao Lei 1: Use for() instead of while() loop in some functions to make code simple. 2: Use this_cpu_ptr() instead of per_cpu_ptr() to make code clean with little performance up. Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> --- kernel/sched/cpuacct.c | 28 +++++----------------------- kernel/sched/cpuacct.h | 4 ++-- 2 files changed, 7 insertions(+), 25 deletions(-) diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index 9c2bbf7..434c2fa 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -238,23 +238,10 @@ static struct cftype files[] = { void cpuacct_charge(struct task_struct *tsk, u64 cputime) { struct cpuacct *ca; - int cpu; - - cpu = task_cpu(tsk); rcu_read_lock(); - - ca = task_ca(tsk); - - while (true) { - u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); - *cpuusage += cputime; - - ca = parent_ca(ca); - if (!ca) - break; - } - + for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) + *this_cpu_ptr(ca->cpuusage) += cputime; rcu_read_unlock(); } @@ -263,18 +250,13 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime) * * Note: it's the caller that updates the account of the root cgroup. */ -void cpuacct_account_field(struct task_struct *p, int index, u64 val) +void cpuacct_account_field(struct task_struct *tsk, int index, u64 val) { - struct kernel_cpustat *kcpustat; struct cpuacct *ca; rcu_read_lock(); - ca = task_ca(p); - while (ca != &root_cpuacct) { - kcpustat = this_cpu_ptr(ca->cpustat); - kcpustat->cpustat[index] += val; - ca = parent_ca(ca); - } + for (ca = task_ca(tsk); ca != &root_cpuacct; ca = parent_ca(ca)) + this_cpu_ptr(ca->cpustat)->cpustat[index] += val; rcu_read_unlock(); } diff --git a/kernel/sched/cpuacct.h b/kernel/sched/cpuacct.h index ed60562..ba72807 100644 --- a/kernel/sched/cpuacct.h +++ b/kernel/sched/cpuacct.h @@ -1,7 +1,7 @@ #ifdef CONFIG_CGROUP_CPUACCT extern void cpuacct_charge(struct task_struct *tsk, u64 cputime); -extern void cpuacct_account_field(struct task_struct *p, int index, u64 val); +extern void cpuacct_account_field(struct task_struct *tsk, int index, u64 val); #else @@ -10,7 +10,7 @@ static inline void cpuacct_charge(struct task_struct *tsk, u64 cputime) } static inline void -cpuacct_account_field(struct task_struct *p, int index, u64 val) +cpuacct_account_field(struct task_struct *tsk, int index, u64 val) { } -- 1.8.5.1 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v4 3/4] cpuacct: Show possible_cpu in cpuacct 2016-03-22 8:37 [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei 2016-03-22 8:37 ` [PATCH v4 1/4] cpuacct: rename parameter in cpuusage_write for readability Zhao Lei 2016-03-22 8:37 ` [PATCH v4 2/4] cpuacct: small restruct for cpuacct Zhao Lei @ 2016-03-22 8:37 ` Zhao Lei 2016-03-31 9:27 ` [tip:sched/core] sched/cpuacct: Show all possible CPUs in cpuacct output tip-bot for Zhao Lei 2016-03-22 8:37 ` [PATCH v4 4/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei 2016-03-22 9:44 ` [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Peter Zijlstra 4 siblings, 1 reply; 23+ messages in thread From: Zhao Lei @ 2016-03-22 8:37 UTC (permalink / raw) To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Zhao Lei Current code show stats of online cpus in cpuacct.statcpus, show stats of present cpus in cpuacct.usage(_percpu), and using present cpus for setting cpuacct.usage. It will cause inconsistent result when a cpu is online or offline or hotpluged. We should always use possible cpus to avoid above problem. Content of cpuacct.usage_percpu changed after this patch: (On a 4 cpu system with maxcpus=32) Before patch: # cat cpuacct.usage_percpu 2456565 411435 1052897 832584 After patch: # cat cpuacct.usage_percpu 2456565 411435 1052897 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> --- kernel/sched/cpuacct.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index 434c2fa..4399850 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -138,7 +138,7 @@ static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft) u64 totalcpuusage = 0; int i; - for_each_present_cpu(i) + for_each_possible_cpu(i) totalcpuusage += cpuacct_cpuusage_read(ca, i); return totalcpuusage; @@ -159,7 +159,7 @@ static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft, goto out; } - for_each_present_cpu(i) + for_each_possible_cpu(i) cpuacct_cpuusage_write(ca, i, 0); out: @@ -172,7 +172,7 @@ static int cpuacct_percpu_seq_show(struct seq_file *m, void *V) u64 percpu; int i; - for_each_present_cpu(i) { + for_each_possible_cpu(i) { percpu = cpuacct_cpuusage_read(ca, i); seq_printf(m, "%llu ", (unsigned long long) percpu); } @@ -191,7 +191,7 @@ static int cpuacct_stats_show(struct seq_file *sf, void *v) int cpu; s64 val = 0; - for_each_online_cpu(cpu) { + for_each_possible_cpu(cpu) { struct kernel_cpustat *kcpustat = per_cpu_ptr(ca->cpustat, cpu); val += kcpustat->cpustat[CPUTIME_USER]; val += kcpustat->cpustat[CPUTIME_NICE]; @@ -200,7 +200,7 @@ static int cpuacct_stats_show(struct seq_file *sf, void *v) seq_printf(sf, "%s %lld\n", cpuacct_stat_desc[CPUACCT_STAT_USER], val); val = 0; - for_each_online_cpu(cpu) { + for_each_possible_cpu(cpu) { struct kernel_cpustat *kcpustat = per_cpu_ptr(ca->cpustat, cpu); val += kcpustat->cpustat[CPUTIME_SYSTEM]; val += kcpustat->cpustat[CPUTIME_IRQ]; -- 1.8.5.1 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [tip:sched/core] sched/cpuacct: Show all possible CPUs in cpuacct output 2016-03-22 8:37 ` [PATCH v4 3/4] cpuacct: Show possible_cpu in cpuacct Zhao Lei @ 2016-03-31 9:27 ` tip-bot for Zhao Lei 0 siblings, 0 replies; 23+ messages in thread From: tip-bot for Zhao Lei @ 2016-03-31 9:27 UTC (permalink / raw) To: linux-tip-commits Cc: efault, tglx, torvalds, zhaolei, htejun, hpa, peterz, mingo, linux-kernel Commit-ID: 5ca3726af7f66a8cc71ce4414cfeb86deb784491 Gitweb: http://git.kernel.org/tip/5ca3726af7f66a8cc71ce4414cfeb86deb784491 Author: Zhao Lei <zhaolei@cn.fujitsu.com> AuthorDate: Tue, 22 Mar 2016 16:37:07 +0800 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Thu, 31 Mar 2016 10:45:56 +0200 sched/cpuacct: Show all possible CPUs in cpuacct output Current code show stats of online CPUs in cpuacct.statcpus, show stats of present cpus in cpuacct.usage(_percpu), and using present CPUs for setting cpuacct.usage. It will cause inconsistent result when a CPU is online or offline or hotpluged. We should always use possible CPUs to avoid above problem. Here are the contents of a cpuacct.usage_percpu sysfs file, on a 4 CPU system with maxcpus=32: Before the patch: # cat cpuacct.usage_percpu 2456565 411435 1052897 832584 After the patch: # cat cpuacct.usage_percpu 2456565 411435 1052897 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 Suggested-by: Peter Zijlstra <peterz@infradead.org> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Tejun Heo <htejun@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/a11d56cef12d0b4807f8be3a46bf9798c3014d59.1458635566.git.zhaolei@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/cpuacct.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-) diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index 4a81120..e39bd4f 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -138,7 +138,7 @@ static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft) u64 totalcpuusage = 0; int i; - for_each_present_cpu(i) + for_each_possible_cpu(i) totalcpuusage += cpuacct_cpuusage_read(ca, i); return totalcpuusage; @@ -159,7 +159,7 @@ static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft, goto out; } - for_each_present_cpu(i) + for_each_possible_cpu(i) cpuacct_cpuusage_write(ca, i, 0); out: @@ -172,7 +172,7 @@ static int cpuacct_percpu_seq_show(struct seq_file *m, void *V) u64 percpu; int i; - for_each_present_cpu(i) { + for_each_possible_cpu(i) { percpu = cpuacct_cpuusage_read(ca, i); seq_printf(m, "%llu ", (unsigned long long) percpu); } @@ -191,7 +191,7 @@ static int cpuacct_stats_show(struct seq_file *sf, void *v) int cpu; s64 val = 0; - for_each_online_cpu(cpu) { + for_each_possible_cpu(cpu) { struct kernel_cpustat *kcpustat = per_cpu_ptr(ca->cpustat, cpu); val += kcpustat->cpustat[CPUTIME_USER]; val += kcpustat->cpustat[CPUTIME_NICE]; @@ -200,7 +200,7 @@ static int cpuacct_stats_show(struct seq_file *sf, void *v) seq_printf(sf, "%s %lld\n", cpuacct_stat_desc[CPUACCT_STAT_USER], val); val = 0; - for_each_online_cpu(cpu) { + for_each_possible_cpu(cpu) { struct kernel_cpustat *kcpustat = per_cpu_ptr(ca->cpustat, cpu); val += kcpustat->cpustat[CPUTIME_SYSTEM]; val += kcpustat->cpustat[CPUTIME_IRQ]; ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [PATCH v4 4/4] cpuacct: split usage into user_usage and sys_usage 2016-03-22 8:37 [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei ` (2 preceding siblings ...) 2016-03-22 8:37 ` [PATCH v4 3/4] cpuacct: Show possible_cpu in cpuacct Zhao Lei @ 2016-03-22 8:37 ` Zhao Lei 2016-03-31 9:27 ` [tip:sched/core] sched/cpuacct: Split usage accounting " tip-bot for Dongsheng Yang 2016-03-22 9:44 ` [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Peter Zijlstra 4 siblings, 1 reply; 23+ messages in thread From: Zhao Lei @ 2016-03-22 8:37 UTC (permalink / raw) To: linux-kernel, Tejun Heo, Peter Zijlstra; +Cc: Dongsheng Yang, Zhao Lei From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Sometimes, cpuacct.usage is not detialed enough to user to see how much usage a group used. We want to know how much time it used in user mode and how much in kernel mode. This patch introduce some more files to tell user these information. # ls /sys/fs/cgroup/cpuacct/cpuacct.usage* /sys/fs/cgroup/cpuacct/cpuacct.usage /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu /sys/fs/cgroup/cpuacct/cpuacct.usage_user /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_user /sys/fs/cgroup/cpuacct/cpuacct.usage_sys /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_sys From: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> Acked-by: Tejun Heo <tj@kernel.org> Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> --- kernel/sched/cpuacct.c | 140 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 113 insertions(+), 27 deletions(-) diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index 4399850..6a4de22 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -25,11 +25,22 @@ enum cpuacct_stat_index { CPUACCT_STAT_NSTATS, }; +enum cpuacct_usage_index { + CPUACCT_USAGE_USER, /* ... user mode */ + CPUACCT_USAGE_SYSTEM, /* ... kernel mode */ + + CPUACCT_USAGE_NRUSAGE, +}; + +struct cpuacct_usage { + u64 usages[CPUACCT_USAGE_NRUSAGE]; +}; + /* track cpu usage of a group of tasks and its child groups */ struct cpuacct { struct cgroup_subsys_state css; /* cpuusage holds pointer to a u64-type object on every cpu */ - u64 __percpu *cpuusage; + struct cpuacct_usage __percpu *cpuusage; struct kernel_cpustat __percpu *cpustat; }; @@ -49,7 +60,7 @@ static inline struct cpuacct *parent_ca(struct cpuacct *ca) return css_ca(ca->css.parent); } -static DEFINE_PER_CPU(u64, root_cpuacct_cpuusage); +static DEFINE_PER_CPU(struct cpuacct_usage, root_cpuacct_cpuusage); static struct cpuacct root_cpuacct = { .cpustat = &kernel_cpustat, .cpuusage = &root_cpuacct_cpuusage, @@ -68,7 +79,7 @@ cpuacct_css_alloc(struct cgroup_subsys_state *parent_css) if (!ca) goto out; - ca->cpuusage = alloc_percpu(u64); + ca->cpuusage = alloc_percpu(struct cpuacct_usage); if (!ca->cpuusage) goto out_free_ca; @@ -96,20 +107,37 @@ static void cpuacct_css_free(struct cgroup_subsys_state *css) kfree(ca); } -static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu) +static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu, + enum cpuacct_usage_index index) { - u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); u64 data; + /* + * We allow index == CPUACCT_USAGE_NRUSAGE here to read + * the sum of suages. + */ + BUG_ON(index > CPUACCT_USAGE_NRUSAGE); + #ifndef CONFIG_64BIT /* * Take rq->lock to make 64-bit read safe on 32-bit platforms. */ raw_spin_lock_irq(&cpu_rq(cpu)->lock); - data = *cpuusage; +#endif + + if (index == CPUACCT_USAGE_NRUSAGE) { + int i = 0; + + data = 0; + for (i = 0; i < CPUACCT_USAGE_NRUSAGE; i++) + data += cpuusage->usages[i]; + } else { + data = cpuusage->usages[index]; + } + +#ifndef CONFIG_64BIT raw_spin_unlock_irq(&cpu_rq(cpu)->lock); -#else - data = *cpuusage; #endif return data; @@ -117,69 +145,103 @@ static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu) static void cpuacct_cpuusage_write(struct cpuacct *ca, int cpu, u64 val) { - u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + int i; #ifndef CONFIG_64BIT /* * Take rq->lock to make 64-bit write safe on 32-bit platforms. */ raw_spin_lock_irq(&cpu_rq(cpu)->lock); - *cpuusage = val; +#endif + + for (i = 0; i < CPUACCT_USAGE_NRUSAGE; i++) + cpuusage->usages[i] = val; + +#ifndef CONFIG_64BIT raw_spin_unlock_irq(&cpu_rq(cpu)->lock); -#else - *cpuusage = val; #endif } /* return total cpu usage (in nanoseconds) of a group */ -static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft) +static u64 __cpuusage_read(struct cgroup_subsys_state *css, + enum cpuacct_usage_index index) { struct cpuacct *ca = css_ca(css); u64 totalcpuusage = 0; int i; for_each_possible_cpu(i) - totalcpuusage += cpuacct_cpuusage_read(ca, i); + totalcpuusage += cpuacct_cpuusage_read(ca, i, index); return totalcpuusage; } +static u64 cpuusage_user_read(struct cgroup_subsys_state *css, + struct cftype *cft) +{ + return __cpuusage_read(css, CPUACCT_USAGE_USER); +} + +static u64 cpuusage_sys_read(struct cgroup_subsys_state *css, + struct cftype *cft) +{ + return __cpuusage_read(css, CPUACCT_USAGE_SYSTEM); +} + +static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft) +{ + return __cpuusage_read(css, CPUACCT_USAGE_NRUSAGE); +} + static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft, u64 val) { struct cpuacct *ca = css_ca(css); - int err = 0; - int i; + int cpu; /* * Only allow '0' here to do a reset. */ - if (val) { - err = -EINVAL; - goto out; - } + if (val) + return -EINVAL; - for_each_possible_cpu(i) - cpuacct_cpuusage_write(ca, i, 0); + for_each_possible_cpu(cpu) + cpuacct_cpuusage_write(ca, cpu, 0); -out: - return err; + return 0; } -static int cpuacct_percpu_seq_show(struct seq_file *m, void *V) +static int __cpuacct_percpu_seq_show(struct seq_file *m, + enum cpuacct_usage_index index) { struct cpuacct *ca = css_ca(seq_css(m)); u64 percpu; int i; for_each_possible_cpu(i) { - percpu = cpuacct_cpuusage_read(ca, i); + percpu = cpuacct_cpuusage_read(ca, i, index); seq_printf(m, "%llu ", (unsigned long long) percpu); } seq_printf(m, "\n"); return 0; } +static int cpuacct_percpu_user_seq_show(struct seq_file *m, void *V) +{ + return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_USER); +} + +static int cpuacct_percpu_sys_seq_show(struct seq_file *m, void *V) +{ + return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_SYSTEM); +} + +static int cpuacct_percpu_seq_show(struct seq_file *m, void *V) +{ + return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_NRUSAGE); +} + static const char * const cpuacct_stat_desc[] = { [CPUACCT_STAT_USER] = "user", [CPUACCT_STAT_SYSTEM] = "system", @@ -220,10 +282,26 @@ static struct cftype files[] = { .write_u64 = cpuusage_write, }, { + .name = "usage_user", + .read_u64 = cpuusage_user_read, + }, + { + .name = "usage_sys", + .read_u64 = cpuusage_sys_read, + }, + { .name = "usage_percpu", .seq_show = cpuacct_percpu_seq_show, }, { + .name = "usage_percpu_user", + .seq_show = cpuacct_percpu_user_seq_show, + }, + { + .name = "usage_percpu_sys", + .seq_show = cpuacct_percpu_sys_seq_show, + }, + { .name = "stat", .seq_show = cpuacct_stats_show, }, @@ -238,10 +316,18 @@ static struct cftype files[] = { void cpuacct_charge(struct task_struct *tsk, u64 cputime) { struct cpuacct *ca; + int index; + + if (user_mode(task_pt_regs(tsk))) + index = CPUACCT_USAGE_USER; + else + index = CPUACCT_USAGE_SYSTEM; rcu_read_lock(); + for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) - *this_cpu_ptr(ca->cpuusage) += cputime; + this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; + rcu_read_unlock(); } -- 1.8.5.1 ^ permalink raw reply related [flat|nested] 23+ messages in thread
* [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage 2016-03-22 8:37 ` [PATCH v4 4/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei @ 2016-03-31 9:27 ` tip-bot for Dongsheng Yang 2016-04-04 14:03 ` Srikar Dronamraju 0 siblings, 1 reply; 23+ messages in thread From: tip-bot for Dongsheng Yang @ 2016-03-31 9:27 UTC (permalink / raw) To: linux-tip-commits Cc: yangds.fnst, zhaolei, peterz, mingo, hpa, tglx, linux-kernel, tj, torvalds, htejun, efault Commit-ID: d740037fac7052e49450f6fa1454f1144a103b55 Gitweb: http://git.kernel.org/tip/d740037fac7052e49450f6fa1454f1144a103b55 Author: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> AuthorDate: Tue, 22 Mar 2016 16:37:08 +0800 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Thu, 31 Mar 2016 10:48:54 +0200 sched/cpuacct: Split usage accounting into user_usage and sys_usage Sometimes, cpuacct.usage is not detailed enough to see how much CPU usage a group had. We want to know how much time it used in user mode and how much in kernel mode. This patch introduces more files to give this information: # ls /sys/fs/cgroup/cpuacct/cpuacct.usage* /sys/fs/cgroup/cpuacct/cpuacct.usage /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu /sys/fs/cgroup/cpuacct/cpuacct.usage_user /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_user /sys/fs/cgroup/cpuacct/cpuacct.usage_sys /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_sys ... while keeping the ABI with the existing counter. Signed-off-by: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> [ Ported to newer kernels. ] Signed-off-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org> Acked-by: Tejun Heo <tj@kernel.org> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Mike Galbraith <efault@gmx.de> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Tejun Heo <htejun@gmail.com> Cc: Thomas Gleixner <tglx@linutronix.de> Link: http://lkml.kernel.org/r/aa171da036b520b51c79549e9b3215d29473f19d.1458635566.git.zhaolei@cn.fujitsu.com Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/cpuacct.c | 140 +++++++++++++++++++++++++++++++++++++++---------- 1 file changed, 113 insertions(+), 27 deletions(-) diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index e39bd4f..df947e0 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -25,11 +25,22 @@ enum cpuacct_stat_index { CPUACCT_STAT_NSTATS, }; +enum cpuacct_usage_index { + CPUACCT_USAGE_USER, /* ... user mode */ + CPUACCT_USAGE_SYSTEM, /* ... kernel mode */ + + CPUACCT_USAGE_NRUSAGE, +}; + +struct cpuacct_usage { + u64 usages[CPUACCT_USAGE_NRUSAGE]; +}; + /* track cpu usage of a group of tasks and its child groups */ struct cpuacct { struct cgroup_subsys_state css; /* cpuusage holds pointer to a u64-type object on every cpu */ - u64 __percpu *cpuusage; + struct cpuacct_usage __percpu *cpuusage; struct kernel_cpustat __percpu *cpustat; }; @@ -49,7 +60,7 @@ static inline struct cpuacct *parent_ca(struct cpuacct *ca) return css_ca(ca->css.parent); } -static DEFINE_PER_CPU(u64, root_cpuacct_cpuusage); +static DEFINE_PER_CPU(struct cpuacct_usage, root_cpuacct_cpuusage); static struct cpuacct root_cpuacct = { .cpustat = &kernel_cpustat, .cpuusage = &root_cpuacct_cpuusage, @@ -68,7 +79,7 @@ cpuacct_css_alloc(struct cgroup_subsys_state *parent_css) if (!ca) goto out; - ca->cpuusage = alloc_percpu(u64); + ca->cpuusage = alloc_percpu(struct cpuacct_usage); if (!ca->cpuusage) goto out_free_ca; @@ -96,20 +107,37 @@ static void cpuacct_css_free(struct cgroup_subsys_state *css) kfree(ca); } -static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu) +static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu, + enum cpuacct_usage_index index) { - u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); u64 data; + /* + * We allow index == CPUACCT_USAGE_NRUSAGE here to read + * the sum of suages. + */ + BUG_ON(index > CPUACCT_USAGE_NRUSAGE); + #ifndef CONFIG_64BIT /* * Take rq->lock to make 64-bit read safe on 32-bit platforms. */ raw_spin_lock_irq(&cpu_rq(cpu)->lock); - data = *cpuusage; +#endif + + if (index == CPUACCT_USAGE_NRUSAGE) { + int i = 0; + + data = 0; + for (i = 0; i < CPUACCT_USAGE_NRUSAGE; i++) + data += cpuusage->usages[i]; + } else { + data = cpuusage->usages[index]; + } + +#ifndef CONFIG_64BIT raw_spin_unlock_irq(&cpu_rq(cpu)->lock); -#else - data = *cpuusage; #endif return data; @@ -117,69 +145,103 @@ static u64 cpuacct_cpuusage_read(struct cpuacct *ca, int cpu) static void cpuacct_cpuusage_write(struct cpuacct *ca, int cpu, u64 val) { - u64 *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + struct cpuacct_usage *cpuusage = per_cpu_ptr(ca->cpuusage, cpu); + int i; #ifndef CONFIG_64BIT /* * Take rq->lock to make 64-bit write safe on 32-bit platforms. */ raw_spin_lock_irq(&cpu_rq(cpu)->lock); - *cpuusage = val; +#endif + + for (i = 0; i < CPUACCT_USAGE_NRUSAGE; i++) + cpuusage->usages[i] = val; + +#ifndef CONFIG_64BIT raw_spin_unlock_irq(&cpu_rq(cpu)->lock); -#else - *cpuusage = val; #endif } /* return total cpu usage (in nanoseconds) of a group */ -static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft) +static u64 __cpuusage_read(struct cgroup_subsys_state *css, + enum cpuacct_usage_index index) { struct cpuacct *ca = css_ca(css); u64 totalcpuusage = 0; int i; for_each_possible_cpu(i) - totalcpuusage += cpuacct_cpuusage_read(ca, i); + totalcpuusage += cpuacct_cpuusage_read(ca, i, index); return totalcpuusage; } +static u64 cpuusage_user_read(struct cgroup_subsys_state *css, + struct cftype *cft) +{ + return __cpuusage_read(css, CPUACCT_USAGE_USER); +} + +static u64 cpuusage_sys_read(struct cgroup_subsys_state *css, + struct cftype *cft) +{ + return __cpuusage_read(css, CPUACCT_USAGE_SYSTEM); +} + +static u64 cpuusage_read(struct cgroup_subsys_state *css, struct cftype *cft) +{ + return __cpuusage_read(css, CPUACCT_USAGE_NRUSAGE); +} + static int cpuusage_write(struct cgroup_subsys_state *css, struct cftype *cft, u64 val) { struct cpuacct *ca = css_ca(css); - int err = 0; - int i; + int cpu; /* * Only allow '0' here to do a reset. */ - if (val) { - err = -EINVAL; - goto out; - } + if (val) + return -EINVAL; - for_each_possible_cpu(i) - cpuacct_cpuusage_write(ca, i, 0); + for_each_possible_cpu(cpu) + cpuacct_cpuusage_write(ca, cpu, 0); -out: - return err; + return 0; } -static int cpuacct_percpu_seq_show(struct seq_file *m, void *V) +static int __cpuacct_percpu_seq_show(struct seq_file *m, + enum cpuacct_usage_index index) { struct cpuacct *ca = css_ca(seq_css(m)); u64 percpu; int i; for_each_possible_cpu(i) { - percpu = cpuacct_cpuusage_read(ca, i); + percpu = cpuacct_cpuusage_read(ca, i, index); seq_printf(m, "%llu ", (unsigned long long) percpu); } seq_printf(m, "\n"); return 0; } +static int cpuacct_percpu_user_seq_show(struct seq_file *m, void *V) +{ + return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_USER); +} + +static int cpuacct_percpu_sys_seq_show(struct seq_file *m, void *V) +{ + return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_SYSTEM); +} + +static int cpuacct_percpu_seq_show(struct seq_file *m, void *V) +{ + return __cpuacct_percpu_seq_show(m, CPUACCT_USAGE_NRUSAGE); +} + static const char * const cpuacct_stat_desc[] = { [CPUACCT_STAT_USER] = "user", [CPUACCT_STAT_SYSTEM] = "system", @@ -220,10 +282,26 @@ static struct cftype files[] = { .write_u64 = cpuusage_write, }, { + .name = "usage_user", + .read_u64 = cpuusage_user_read, + }, + { + .name = "usage_sys", + .read_u64 = cpuusage_sys_read, + }, + { .name = "usage_percpu", .seq_show = cpuacct_percpu_seq_show, }, { + .name = "usage_percpu_user", + .seq_show = cpuacct_percpu_user_seq_show, + }, + { + .name = "usage_percpu_sys", + .seq_show = cpuacct_percpu_sys_seq_show, + }, + { .name = "stat", .seq_show = cpuacct_stats_show, }, @@ -238,10 +316,18 @@ static struct cftype files[] = { void cpuacct_charge(struct task_struct *tsk, u64 cputime) { struct cpuacct *ca; + int index; + + if (user_mode(task_pt_regs(tsk))) + index = CPUACCT_USAGE_USER; + else + index = CPUACCT_USAGE_SYSTEM; rcu_read_lock(); + for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) - *this_cpu_ptr(ca->cpuusage) += cputime; + this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; + rcu_read_unlock(); } ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage 2016-03-31 9:27 ` [tip:sched/core] sched/cpuacct: Split usage accounting " tip-bot for Dongsheng Yang @ 2016-04-04 14:03 ` Srikar Dronamraju 2016-04-06 6:54 ` Ingo Molnar 0 siblings, 1 reply; 23+ messages in thread From: Srikar Dronamraju @ 2016-04-04 14:03 UTC (permalink / raw) To: tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, peterz, mingo, yangds.fnst, hpa * tip-bot for Dongsheng Yang <tipbot@zytor.com> [2016-03-31 02:27:39]: > Commit-ID: d740037fac7052e49450f6fa1454f1144a103b55 > Gitweb: http://git.kernel.org/tip/d740037fac7052e49450f6fa1454f1144a103b55 > Author: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> > AuthorDate: Tue, 22 Mar 2016 16:37:08 +0800 > Committer: Ingo Molnar <mingo@kernel.org> > CommitDate: Thu, 31 Mar 2016 10:48:54 +0200 > > sched/cpuacct: Split usage accounting into user_usage and sys_usage > > Sometimes, cpuacct.usage is not detailed enough to see how much CPU > usage a group had. We want to know how much time it used in user mode > and how much in kernel mode. > Unfortunately this commit causes boot to fail on my power 7 box. > @@ -238,10 +316,18 @@ static struct cftype files[] = { > void cpuacct_charge(struct task_struct *tsk, u64 cputime) > { > struct cpuacct *ca; > + int index; > + > + if (user_mode(task_pt_regs(tsk))) > + index = CPUACCT_USAGE_USER; > + else > + index = CPUACCT_USAGE_SYSTEM; > > rcu_read_lock(); > + > for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) > - *this_cpu_ptr(ca->cpuusage) += cputime; > + this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; The above line seems to be the cause of oops. Attached the complete console log below. Unable to handle kernel paging request for data at address 0x00000108 Faulting instruction address: 0xc0000000000f8fbc cpu 0x0: Vector: 300 (Data Access) at [c000000f1f607720] pc: c0000000000f8fbc: .cpuacct_charge+0x2c/0x110 lr: c0000000000dd184: .update_curr+0xa4/0x1d0 sp: c000000f1f6079a0 msr: 9000000002009032 dar: 108 dsisr: 40000000 current = 0xc000000f1f582480 paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01 pid = 2, comm = kthreadd Linux version 4.6.0-rc1-master+ (srikar@llmjuno) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #38 SMP Tue Apr 5 00:20:26 IST 2016 enter ? for help [c000000f1f607a30] c0000000000dd184 .update_curr+0xa4/0x1d0 [c000000f1f607ad0] c0000000000dfc2c .dequeue_task_fair+0xbc/0x1110 [c000000f1f607be0] c0000000000d3c70 .do_set_cpus_allowed+0x1b0/0x1c0 [c000000f1f607c70] c0000000000d4424 .__set_cpus_allowed_ptr+0x194/0x2e0 [c000000f1f607d50] c0000000000c70b8 .kthreadd+0x68/0x2c0 [c000000f1f607e30] c0000000000095e4 .ret_from_kernel_thread+0x58/0x74 git bisect shows this as the commit that causes the problem. I verifed by booting with the tip + revert of this commit. > + > rcu_read_unlock(); > } > -- Thanks and Regards Srikar Dronamraju phys_mem_size = 0x1000000000 cpu_features = 0x0b7e7ae518500049 possible = 0x3fffffff18500649 always = 0x0000000018100040 cpu_user_features = 0xdc0065c7 0x20000000 mmu_features = 0x7c000003 firmware_features = 0x0000000010000000 htab_address = 0xc000000ffc000000 htab_hash_mask = 0x7ffff ----------------------------------------------------- <- setup_system() Linux version 4.6.0-rc1-master+ (srikar@llmjuno) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #38 SMP Tue Apr 5 00:20:26 IST 2016 Node 0 Memory: Node 1 Memory: 0x0-0x1000000000 numa: Initmem setup node 0 numa: NODE_DATA [mem 0xf2ed9e100-0xf2eda7fff] numa: NODE_DATA(0) on node 1 numa: Initmem setup node 1 [mem 0x00000000-0xfffffffff] numa: NODE_DATA [mem 0xf2ed94200-0xf2ed9e0ff] Section 3884 and 3886 (node 1) have a circular dependency on usemap and pgdat allocations Probing IODA IO-Hub /io-hub@3efe00000000 Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe00080000 PCI host bridge /io-hub@3efe00000000/pciex@3efe00080000 (primary) ranges: IO 0x00003efe01000000..0x00003efe017fffff -> 0x0000000000000000 MEM 0x00003da080000000..0x00003da0fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0x600) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe00090000 PCI host bridge /io-hub@3efe00000000/pciex@3efe00090000 ranges: IO 0x00003efe01800000..0x00003efe01ffffff -> 0x0000000000000000 MEM 0x00003da180000000..0x00003da1fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0xa00) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe000a0000 PCI host bridge /io-hub@3efe00000000/pciex@3efe000a0000 ranges: IO 0x00003efe02000000..0x00003efe027fffff -> 0x0000000000000000 MEM 0x00003da280000000..0x00003da2fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0xe00) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe000b0000 PCI host bridge /io-hub@3efe00000000/pciex@3efe000b0000 ranges: IO 0x00003efe02800000..0x00003efe02ffffff -> 0x0000000000000000 MEM 0x00003da380000000..0x00003da3fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0x1200) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe000c0000 PCI host bridge /io-hub@3efe00000000/pciex@3efe000c0000 ranges: IO 0x00003efe03000000..0x00003efe037fffff -> 0x0000000000000000 MEM 0x00003da480000000..0x00003da4fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0x1600) Initializing IODA0 OPAL PHB /io-hub@3efe00000000/pciex@3efe000d0000 PCI host bridge /io-hub@3efe00000000/pciex@3efe000d0000 ranges: IO 0x00003efe03800000..0x00003efe03ffffff -> 0x0000000000000000 MEM 0x00003da580000000..0x00003da5fffeffff -> 0x0000000080000000 Not support M64 window 128 (127) PE's M32: 0x80000000 [segment=0x1000000] IO: 0x800000 [segment=0x10000] Allocated bitmap for 256 MSIs (base IRQ 0x1a00) OPAL nvram setup, 1048576 bytes Top of RAM: 0x1000000000, Total RAM: 0x1000000000 Memory hole size: 0MB Zone ranges: DMA [mem 0x0000000000000000-0x0000000fffffffff] DMA32 empty Normal empty Movable zone start for each node Early memory node ranges node 1: [mem 0x0000000000000000-0x0000000fffffffff] Could not find start_pfn for node 0 Initmem setup node 0 [mem 0x0000000000000000-0x0000000000000000] On node 0 totalpages: 0 Initmem setup node 1 [mem 0x0000000000000000-0x0000000fffffffff] On node 1 totalpages: 1048576 DMA zone: 1024 pages used for memmap DMA zone: 0 pages reserved DMA zone: 1048576 pages, LIFO batch:1 percpu: Embedded 3 pages/cpu @c000000f2dc00000 s133656 r0 d62952 u262144 pcpu-alloc: s133656 r0 d62952 u262144 alloc=1*1048576 pcpu-alloc: [0] 00 01 02 03 [0] 04 05 06 07 pcpu-alloc: [0] 08 09 10 11 [0] 12 13 14 15 pcpu-alloc: [0] 16 17 18 19 [0] 20 21 22 23 pcpu-alloc: [0] 24 25 26 27 [0] 28 29 30 31 pcpu-alloc: [0] 32 33 34 35 [0] 36 37 38 39 pcpu-alloc: [0] 40 41 42 43 [0] 44 45 46 47 pcpu-alloc: [0] 48 49 50 51 [0] 52 53 54 55 pcpu-alloc: [0] 56 57 58 59 [0] 60 61 62 63 Built 2 zonelists in Node order, mobility grouping on. Total pages: 1047552 Policy zone: DMA Kernel command line: BOOT_IMAGE=/boot/vmlinux-4.6.0-rc1-master+ root=/dev/mapper/fedora_llmjuno03b-root ro rd.md=0 rd.dm=0 vconsole.keymap=us rd.luks=0 vconsole.font=latarcyrheb-sun16 rd.lvm.lv=fedora_llmjuno03b/swap rd.lvm.lv=fedora_llmjuno03b/root nmi_watchdog=0 nohpet ignore_loglevel log_buf_len=10M print_fatal_signals=1 loglevel=8 panic=30 log_buf_len: 16777216 bytes early log buf free: 122804(93%) PID hash table entries: 4096 (order: -1, 32768 bytes) Sorting __ex_table... Memory: 63497024K/67108864K available (9152K kernel code, 1728K rwdata, 3160K rodata, 832K init, 1040K bss, 253120K reserved, 3358720K cma-reserved) SLUB: HWalign=128, Order=0-3, MinObjects=0, CPUs=64, Nodes=2 Hierarchical RCU implementation. Build-time adjustment of leaf fanout to 64. RCU restricting CPUs from NR_CPUS=2048 to nr_cpu_ids=64. RCU: Adjusting geometry for rcu_fanout_leaf=64, nr_cpu_ids=64 NR_IRQS:512 nr_irqs:512 16 ICS OPAL backend registered time_init: decrementer frequency = 512.000000 MHz time_init: processor frequency = 3300.000000 MHz clocksource: timebase: mask: 0xffffffffffffffff max_cycles: 0x761537d007, max_idle_ns: 440795202126 ns clocksource: timebase mult[1f40000] shift[24] registered clockevent: decrementer mult[83126e98] shift[32] cpu[0] Console: colour dummy device 80x25 console [hvc0] enabled console [hvc0] enabled bootconsole [udbg0] disabled bootconsole [udbg0] disabled mempolicy: Enabling automatic NUMA balancing. Configure with numa_balancing= or the kernel.numa_balancing sysctl pid_max: default: 65536 minimum: 512 Dentry cache hash table entries: 8388608 (order: 10, 67108864 bytes) Inode-cache hash table entries: 4194304 (order: 9, 33554432 bytes) Mount-cache hash table entries: 131072 (order: 4, 1048576 bytes) Mountpoint-cache hash table entries: 131072 (order: 4, 1048576 bytes) Unable to handle kernel paging request for data at address 0x00000108 Faulting instruction address: 0xc0000000000f8fbc cpu 0x0: Vector: 300 (Data Access) at [c000000f1f607720] pc: c0000000000f8fbc: .cpuacct_charge+0x2c/0x110 lr: c0000000000dd184: .update_curr+0xa4/0x1d0 sp: c000000f1f6079a0 msr: 9000000002009032 dar: 108 dsisr: 40000000 current = 0xc000000f1f582480 paca = 0xc00000000fe00000 softe: 0 irq_happened: 0x01 pid = 2, comm = kthreadd Linux version 4.6.0-rc1-master+ (srikar@llmjuno) (gcc version 4.8.2 20131212 (Red Hat 4.8.2-7) (GCC) ) #38 SMP Tue Apr 5 00:20:26 IST 2016 enter ? for help [c000000f1f607a30] c0000000000dd184 .update_curr+0xa4/0x1d0 [c000000f1f607ad0] c0000000000dfc2c .dequeue_task_fair+0xbc/0x1110 [c000000f1f607be0] c0000000000d3c70 .do_set_cpus_allowed+0x1b0/0x1c0 [c000000f1f607c70] c0000000000d4424 .__set_cpus_allowed_ptr+0x194/0x2e0 [c000000f1f607d50] c0000000000c70b8 .kthreadd+0x68/0x2c0 [c000000f1f607e30] c0000000000095e4 .ret_from_kernel_thread+0x58/0x74 ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage 2016-04-04 14:03 ` Srikar Dronamraju @ 2016-04-06 6:54 ` Ingo Molnar 2016-04-06 7:47 ` Zhao Lei 2016-04-06 10:32 ` Anton Blanchard 0 siblings, 2 replies; 23+ messages in thread From: Ingo Molnar @ 2016-04-06 6:54 UTC (permalink / raw) To: Srikar Dronamraju Cc: tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, peterz, yangds.fnst, hpa * Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote: > * tip-bot for Dongsheng Yang <tipbot@zytor.com> [2016-03-31 02:27:39]: > > > Commit-ID: d740037fac7052e49450f6fa1454f1144a103b55 > > Gitweb: http://git.kernel.org/tip/d740037fac7052e49450f6fa1454f1144a103b55 > > Author: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> > > AuthorDate: Tue, 22 Mar 2016 16:37:08 +0800 > > Committer: Ingo Molnar <mingo@kernel.org> > > CommitDate: Thu, 31 Mar 2016 10:48:54 +0200 > > > > sched/cpuacct: Split usage accounting into user_usage and sys_usage > > > > Sometimes, cpuacct.usage is not detailed enough to see how much CPU > > usage a group had. We want to know how much time it used in user mode > > and how much in kernel mode. > > > > Unfortunately this commit causes boot to fail on my power 7 box. > > > @@ -238,10 +316,18 @@ static struct cftype files[] = { > > void cpuacct_charge(struct task_struct *tsk, u64 cputime) > > { > > struct cpuacct *ca; > > + int index; > > + > > + if (user_mode(task_pt_regs(tsk))) > > + index = CPUACCT_USAGE_USER; > > + else > > + index = CPUACCT_USAGE_SYSTEM; > > > > rcu_read_lock(); > > + > > for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) > > - *this_cpu_ptr(ca->cpuusage) += cputime; > > + this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; > > The above line seems to be the cause of oops. Attached the complete console log below. Weird - not much changed wrt. the cpuusage logic, we only increased its size. If you change the above loop to something like: for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) { if (WARN_ON_ONCE(!ca->cpuusage)) continue; this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; } then do you get the warning and the bootup succeeds? Thanks, Ingo ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage 2016-04-06 6:54 ` Ingo Molnar @ 2016-04-06 7:47 ` Zhao Lei 2016-04-06 7:50 ` Peter Zijlstra 2016-04-06 10:32 ` Anton Blanchard 1 sibling, 1 reply; 23+ messages in thread From: Zhao Lei @ 2016-04-06 7:47 UTC (permalink / raw) To: 'Ingo Molnar', 'Srikar Dronamraju' Cc: tglx, efault, htejun, linux-kernel, tj, torvalds, peterz, yangds.fnst, hpa Hi, Ingo and Srikar, > -----Original Message----- > From: Ingo Molnar [mailto:mingo.kernel.org@gmail.com] On Behalf Of Ingo > Molnar > Sent: Wednesday, April 06, 2016 2:55 PM > To: Srikar Dronamraju <srikar@linux.vnet.ibm.com> > Cc: tglx@linutronix.de; efault@gmx.de; htejun@gmail.com; > linux-kernel@vger.kernel.org; tj@kernel.org; torvalds@linux-foundation.org; > zhaolei@cn.fujitsu.com; peterz@infradead.org; yangds.fnst@cn.fujitsu.com; > hpa@zytor.com > Subject: Re: [tip:sched/core] sched/cpuacct: Split usage accounting into > user_usage and sys_usage > > > * Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote: > > > * tip-bot for Dongsheng Yang <tipbot@zytor.com> [2016-03-31 02:27:39]: > > > > > Commit-ID: d740037fac7052e49450f6fa1454f1144a103b55 > > > Gitweb: > http://git.kernel.org/tip/d740037fac7052e49450f6fa1454f1144a103b55 > > > Author: Dongsheng Yang <yangds.fnst@cn.fujitsu.com> > > > AuthorDate: Tue, 22 Mar 2016 16:37:08 +0800 > > > Committer: Ingo Molnar <mingo@kernel.org> > > > CommitDate: Thu, 31 Mar 2016 10:48:54 +0200 > > > > > > sched/cpuacct: Split usage accounting into user_usage and sys_usage > > > > > > Sometimes, cpuacct.usage is not detailed enough to see how much CPU > > > usage a group had. We want to know how much time it used in user mode > > > and how much in kernel mode. > > > > > > > Unfortunately this commit causes boot to fail on my power 7 box. > > > > > @@ -238,10 +316,18 @@ static struct cftype files[] = { > > > void cpuacct_charge(struct task_struct *tsk, u64 cputime) > > > { > > > struct cpuacct *ca; > > > + int index; > > > + > > > + if (user_mode(task_pt_regs(tsk))) > > > + index = CPUACCT_USAGE_USER; > > > + else > > > + index = CPUACCT_USAGE_SYSTEM; > > > > > > rcu_read_lock(); > > > + > > > for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) > > > - *this_cpu_ptr(ca->cpuusage) += cputime; > > > + this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; > > > > The above line seems to be the cause of oops. Attached the complete console > log below. > Thanks for reporting this bug. I'll take over this patch. > Weird - not much changed wrt. the cpuusage logic, we only increased its size. > Thanks for concern. I trying to review this patch, but had not found something strange, except: > If you change the above loop to something like: > > for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) { > if (WARN_ON_ONCE(!ca->cpuusage)) > continue; > this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; Or s/ this_cpu_ptr/get_cpu_var to avoid preempt? But preempt problem should not cause problem in every boot, And Srikar's bisect means it failed in every boot. > } > > then do you get the warning and the bootup succeeds? > I also building a power7 vm for test, still in building... Thanks Zhaolei ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage 2016-04-06 7:47 ` Zhao Lei @ 2016-04-06 7:50 ` Peter Zijlstra 0 siblings, 0 replies; 23+ messages in thread From: Peter Zijlstra @ 2016-04-06 7:50 UTC (permalink / raw) To: Zhao Lei Cc: 'Ingo Molnar', 'Srikar Dronamraju', tglx, efault, htejun, linux-kernel, tj, torvalds, yangds.fnst, hpa On Wed, Apr 06, 2016 at 03:47:18PM +0800, Zhao Lei wrote: > > If you change the above loop to something like: > > > > for (ca = task_ca(tsk); ca; ca = parent_ca(ca)) { > > if (WARN_ON_ONCE(!ca->cpuusage)) > > continue; > > this_cpu_ptr(ca->cpuusage)->usages[index] += cputime; > > Or s/ this_cpu_ptr/get_cpu_var to avoid preempt? This is all called with rq->lock held, preemption is firmly disabled. ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage 2016-04-06 6:54 ` Ingo Molnar 2016-04-06 7:47 ` Zhao Lei @ 2016-04-06 10:32 ` Anton Blanchard 2016-04-06 11:08 ` Peter Zijlstra 1 sibling, 1 reply; 23+ messages in thread From: Anton Blanchard @ 2016-04-06 10:32 UTC (permalink / raw) To: Ingo Molnar Cc: Srikar Dronamraju, tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, peterz, yangds.fnst, hpa, Stephen Rothwell, Michael Ellerman Hi, > > > void cpuacct_charge(struct task_struct *tsk, u64 cputime) > > > { > > > struct cpuacct *ca; > > > + int index; > > > + > > > + if (user_mode(task_pt_regs(tsk))) > > > + index = CPUACCT_USAGE_USER; > > > + else > > > + index = CPUACCT_USAGE_SYSTEM; This is oopsing because PowerPC task_pt_regs() returns NULL for kernel threads. Anton ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage 2016-04-06 10:32 ` Anton Blanchard @ 2016-04-06 11:08 ` Peter Zijlstra 2016-04-06 11:43 ` Zhao Lei 2016-04-06 11:59 ` [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() Anton Blanchard 0 siblings, 2 replies; 23+ messages in thread From: Peter Zijlstra @ 2016-04-06 11:08 UTC (permalink / raw) To: Anton Blanchard Cc: Ingo Molnar, Srikar Dronamraju, tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, yangds.fnst, hpa, Stephen Rothwell, Michael Ellerman On Wed, Apr 06, 2016 at 08:32:19PM +1000, Anton Blanchard wrote: > Hi, > > > > > void cpuacct_charge(struct task_struct *tsk, u64 cputime) > > > > { > > > > struct cpuacct *ca; > > > > + int index; > > > > + > > > > + if (user_mode(task_pt_regs(tsk))) > > > > + index = CPUACCT_USAGE_USER; > > > > + else > > > > + index = CPUACCT_USAGE_SYSTEM; > > This is oopsing because PowerPC task_pt_regs() returns NULL for > kernel threads. Ah, so sometihng like: struct pt_regs *regs = task_pt_regs(); int index = CPUACCT_USAGE_SYSTEM; if (regs && user_mode(regs)) index = CPUACCT_USAGE_USER; should work, right? ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [tip:sched/core] sched/cpuacct: Split usage accounting into user_usage and sys_usage 2016-04-06 11:08 ` Peter Zijlstra @ 2016-04-06 11:43 ` Zhao Lei 2016-04-06 11:59 ` [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() Anton Blanchard 1 sibling, 0 replies; 23+ messages in thread From: Zhao Lei @ 2016-04-06 11:43 UTC (permalink / raw) To: 'Peter Zijlstra', 'Anton Blanchard' Cc: 'Ingo Molnar', 'Srikar Dronamraju', tglx, efault, htejun, linux-kernel, tj, torvalds, yangds.fnst, hpa, 'Stephen Rothwell', 'Michael Ellerman' Hi, Anton and Peter > -----Original Message----- > From: Peter Zijlstra [mailto:peterz@infradead.org] > Sent: Wednesday, April 06, 2016 7:08 PM > To: Anton Blanchard <anton@samba.org> > Cc: Ingo Molnar <mingo@kernel.org>; Srikar Dronamraju > <srikar@linux.vnet.ibm.com>; tglx@linutronix.de; efault@gmx.de; > htejun@gmail.com; linux-kernel@vger.kernel.org; tj@kernel.org; > torvalds@linux-foundation.org; zhaolei@cn.fujitsu.com; > yangds.fnst@cn.fujitsu.com; hpa@zytor.com; Stephen Rothwell > <sfr@canb.auug.org.au>; Michael Ellerman <mpe@ellerman.id.au> > Subject: Re: [tip:sched/core] sched/cpuacct: Split usage accounting into > user_usage and sys_usage > > On Wed, Apr 06, 2016 at 08:32:19PM +1000, Anton Blanchard wrote: > > Hi, > > > > > > > void cpuacct_charge(struct task_struct *tsk, u64 cputime) > > > > > { > > > > > struct cpuacct *ca; > > > > > + int index; > > > > > + > > > > > + if (user_mode(task_pt_regs(tsk))) > > > > > + index = CPUACCT_USAGE_USER; > > > > > + else > > > > > + index = CPUACCT_USAGE_SYSTEM; > > > > This is oopsing because PowerPC task_pt_regs() returns NULL for > > kernel threads. > Thanks for notice it. > Ah, so sometihng like: > > struct pt_regs *regs = task_pt_regs(); > int index = CPUACCT_USAGE_SYSTEM; > > if (regs && user_mode(regs)) > index = CPUACCT_USAGE_USER; > > should work, right? > Thanks, I'll confirm it in a Power7 VM from buildroot. (maybe need some hours or till tomorrow in a poor pc...) Btw, I reproduced this bug in above vm. Thanks Zhaolei > ^ permalink raw reply [flat|nested] 23+ messages in thread
* [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-06 11:08 ` Peter Zijlstra 2016-04-06 11:43 ` Zhao Lei @ 2016-04-06 11:59 ` Anton Blanchard 2016-04-06 13:26 ` Srikar Dronamraju ` (2 more replies) 1 sibling, 3 replies; 23+ messages in thread From: Anton Blanchard @ 2016-04-06 11:59 UTC (permalink / raw) To: Peter Zijlstra Cc: Ingo Molnar, Srikar Dronamraju, tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, yangds.fnst, hpa, Stephen Rothwell, Michael Ellerman, linuxppc-dev Hi Peter, > Ah, so sometihng like: > > struct pt_regs *regs = task_pt_regs(); > int index = CPUACCT_USAGE_SYSTEM; > > if (regs && user_mode(regs)) > index = CPUACCT_USAGE_USER; > > should work, right? Looks good, and the patch below does fix the oops for me. Anton -- task_pt_regs() can return NULL for kernel threads, so add a check. This fixes an oops at boot on ppc64. Signed-off-by: Anton Blanchard <anton@samba.org> --- diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index df947e0..41f85c4 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -316,12 +316,11 @@ static struct cftype files[] = { void cpuacct_charge(struct task_struct *tsk, u64 cputime) { struct cpuacct *ca; - int index; + int index = CPUACCT_USAGE_SYSTEM; + struct pt_regs *regs = task_pt_regs(tsk); - if (user_mode(task_pt_regs(tsk))) + if (regs && user_mode(regs)) index = CPUACCT_USAGE_USER; - else - index = CPUACCT_USAGE_SYSTEM; rcu_read_lock(); ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-06 11:59 ` [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() Anton Blanchard @ 2016-04-06 13:26 ` Srikar Dronamraju 2016-04-06 17:05 ` Zhao Lei 2016-04-13 7:43 ` Ingo Molnar 2016-04-11 6:16 ` Ping? " Michael Ellerman 2016-04-13 11:49 ` [tip:sched/core] " tip-bot for Anton Blanchard 2 siblings, 2 replies; 23+ messages in thread From: Srikar Dronamraju @ 2016-04-06 13:26 UTC (permalink / raw) To: Anton Blanchard Cc: Peter Zijlstra, Ingo Molnar, tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, yangds.fnst, hpa, Stephen Rothwell, Michael Ellerman, linuxppc-dev * Anton Blanchard <anton@samba.org> [2016-04-06 21:59:50]: > Looks good, and the patch below does fix the oops for me. > > Anton > -- > > task_pt_regs() can return NULL for kernel threads, so add a check. > This fixes an oops at boot on ppc64. > > Signed-off-by: Anton Blanchard <anton@samba.org> Works for me too. Reported-and-Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> -- Thanks and Regards Srikar Dronamraju ^ permalink raw reply [flat|nested] 23+ messages in thread
* RE: [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-06 13:26 ` Srikar Dronamraju @ 2016-04-06 17:05 ` Zhao Lei 2016-04-13 7:43 ` Ingo Molnar 1 sibling, 0 replies; 23+ messages in thread From: Zhao Lei @ 2016-04-06 17:05 UTC (permalink / raw) To: 'Srikar Dronamraju', 'Anton Blanchard' Cc: 'Peter Zijlstra', 'Ingo Molnar', tglx, efault, htejun, linux-kernel, tj, torvalds, yangds.fnst, hpa, 'Stephen Rothwell', 'Michael Ellerman', linuxppc-dev > -----Original Message----- > From: Srikar Dronamraju [mailto:srikar@linux.vnet.ibm.com] > Sent: Wednesday, April 06, 2016 9:27 PM > To: Anton Blanchard <anton@samba.org> > Cc: Peter Zijlstra <peterz@infradead.org>; Ingo Molnar <mingo@kernel.org>; > tglx@linutronix.de; efault@gmx.de; htejun@gmail.com; > linux-kernel@vger.kernel.org; tj@kernel.org; torvalds@linux-foundation.org; > zhaolei@cn.fujitsu.com; yangds.fnst@cn.fujitsu.com; hpa@zytor.com; Stephen > Rothwell <sfr@canb.auug.org.au>; Michael Ellerman <mpe@ellerman.id.au>; > linuxppc-dev@lists.ozlabs.org > Subject: Re: [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() > > * Anton Blanchard <anton@samba.org> [2016-04-06 21:59:50]: > > > Looks good, and the patch below does fix the oops for me. > > > > Anton > > -- > > > > task_pt_regs() can return NULL for kernel threads, so add a check. > > This fixes an oops at boot on ppc64. > > > > Signed-off-by: Anton Blanchard <anton@samba.org> > > Works for me too. > > Reported-and-Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> > Thanks, all I tested it in the vm(can not boot before fix): [single cpu] # for f in cpuacct.*; do echo "$f: "; cat $f; done cpuacct.stat: user 260 system 451 cpuacct.usage: 6923456654 cpuacct.usage_percpu: 6963446178 cpuacct.usage_percpu_sys: 2610934362 cpuacct.usage_percpu_user: 4451667140 cpuacct.usage_sys: 2612684054 cpuacct.usage_user: 4540275322 # [2 cpu wih 8 maxcpus] # for f in cpuacct.*; do echo "$f: "; cat $f; done cpuacct.stat: user 205 system 536 cpuacct.usage: 7293688020 cpuacct.usage_percpu: 3785674990 3551323200 0 0 0 0 0 0 cpuacct.usage_percpu_sys: 2227281124 1635060584 0 0 0 0 0 0 cpuacct.usage_percpu_user: 1567487176 1992278818 0 0 0 0 0 0 cpuacct.usage_sys: 3863445982 cpuacct.usage_user: 3643874038 # # cat /proc/cpuinfo processor : 0 cpu : POWER7 (raw), altivec supported clock : 1000.000000MHz revision : 2.3 (pvr 003f 0203) processor : 1 cpu : POWER7 (raw), altivec supported clock : 1000.000000MHz revision : 2.3 (pvr 003f 0203) timebase : 512000000 platform : pSeries model : IBM pSeries (emulated by qemu) machine : CHRP IBM pSeries (emulated by qemu) # Thanks Zhaolei > -- > Thanks and Regards > Srikar Dronamraju > ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-06 13:26 ` Srikar Dronamraju 2016-04-06 17:05 ` Zhao Lei @ 2016-04-13 7:43 ` Ingo Molnar 2016-04-13 11:01 ` Michael Ellerman 1 sibling, 1 reply; 23+ messages in thread From: Ingo Molnar @ 2016-04-13 7:43 UTC (permalink / raw) To: Srikar Dronamraju Cc: Anton Blanchard, Peter Zijlstra, tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, yangds.fnst, hpa, Stephen Rothwell, Michael Ellerman, linuxppc-dev * Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote: > * Anton Blanchard <anton@samba.org> [2016-04-06 21:59:50]: > > > Looks good, and the patch below does fix the oops for me. > > > > Anton > > -- > > > > task_pt_regs() can return NULL for kernel threads, so add a check. > > This fixes an oops at boot on ppc64. > > > > Signed-off-by: Anton Blanchard <anton@samba.org> > > Works for me too. > > Reported-and-Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Could someone please re-send the fix, because it has not reached me nor lkml. Thanks, Ingo ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-13 7:43 ` Ingo Molnar @ 2016-04-13 11:01 ` Michael Ellerman 2016-04-13 11:21 ` Ingo Molnar 0 siblings, 1 reply; 23+ messages in thread From: Michael Ellerman @ 2016-04-13 11:01 UTC (permalink / raw) To: Ingo Molnar, Srikar Dronamraju Cc: Anton Blanchard, Peter Zijlstra, tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, yangds.fnst, hpa, Stephen Rothwell, linuxppc-dev On Wed, 2016-04-13 at 09:43 +0200, Ingo Molnar wrote: > * Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote: > > > * Anton Blanchard <anton@samba.org> [2016-04-06 21:59:50]: > > > > > Looks good, and the patch below does fix the oops for me. > > > > > > Anton > > > -- > > > > > > task_pt_regs() can return NULL for kernel threads, so add a check. > > > This fixes an oops at boot on ppc64. > > > > > > Signed-off-by: Anton Blanchard <anton@samba.org> > > > > Works for me too. > > > > Reported-and-Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> > > Could someone please re-send the fix, because it has not reached me nor lkml. It did hit LKML: http://lkml.kernel.org/r/20160406215950.04bc3f0b@kryten But that did have some verbiage at the top. Anton's also resent it directly To you. cheers ^ permalink raw reply [flat|nested] 23+ messages in thread
* Re: [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-13 11:01 ` Michael Ellerman @ 2016-04-13 11:21 ` Ingo Molnar 0 siblings, 0 replies; 23+ messages in thread From: Ingo Molnar @ 2016-04-13 11:21 UTC (permalink / raw) To: Michael Ellerman Cc: Srikar Dronamraju, Anton Blanchard, Peter Zijlstra, tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, yangds.fnst, hpa, Stephen Rothwell, linuxppc-dev * Michael Ellerman <mpe@ellerman.id.au> wrote: > On Wed, 2016-04-13 at 09:43 +0200, Ingo Molnar wrote: > > * Srikar Dronamraju <srikar@linux.vnet.ibm.com> wrote: > > > > > * Anton Blanchard <anton@samba.org> [2016-04-06 21:59:50]: > > > > > > > Looks good, and the patch below does fix the oops for me. > > > > > > > > Anton > > > > -- > > > > > > > > task_pt_regs() can return NULL for kernel threads, so add a check. > > > > This fixes an oops at boot on ppc64. > > > > > > > > Signed-off-by: Anton Blanchard <anton@samba.org> > > > > > > Works for me too. > > > > > > Reported-and-Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> > > > > Could someone please re-send the fix, because it has not reached me nor lkml. > > It did hit LKML: > > http://lkml.kernel.org/r/20160406215950.04bc3f0b@kryten > > But that did have some verbiage at the top. > > Anton's also resent it directly To you. So it was in my Spam folder, due to the following SPF softfail: Received-SPF: softfail (google.com: domain of transitioning anton@samba.org does not designate 198.145.29.136 as permitted sender) client-ip=198.145.29.136; have the patch now. Thanks, Ingo ^ permalink raw reply [flat|nested] 23+ messages in thread
* Ping? Re: [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-06 11:59 ` [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() Anton Blanchard 2016-04-06 13:26 ` Srikar Dronamraju @ 2016-04-11 6:16 ` Michael Ellerman 2016-04-13 11:49 ` [tip:sched/core] " tip-bot for Anton Blanchard 2 siblings, 0 replies; 23+ messages in thread From: Michael Ellerman @ 2016-04-11 6:16 UTC (permalink / raw) To: Anton Blanchard, Peter Zijlstra, Ingo Molnar Cc: Srikar Dronamraju, tglx, efault, htejun, linux-kernel, tj, torvalds, zhaolei, yangds.fnst, hpa, Stephen Rothwell, linuxppc-dev Hi Peter/Ingo, On Wed, 2016-04-06 at 21:59 +1000, Anton Blanchard wrote: > Hi Peter, > > > Ah, so sometihng like: > > > > struct pt_regs *regs = task_pt_regs(); > > int index = CPUACCT_USAGE_SYSTEM; > > > > if (regs && user_mode(regs)) > > index = CPUACCT_USAGE_USER; > > > > should work, right? > > Looks good, and the patch below does fix the oops for me. Can we get this merged please? It's blocking all my boot tests. If anyone's bothered: Fixes: d740037fac70 ("sched/cpuacct: Split usage accounting into user_usage and sys_usage") cheers > task_pt_regs() can return NULL for kernel threads, so add a check. > This fixes an oops at boot on ppc64. > > Signed-off-by: Anton Blanchard <anton@samba.org> > --- > > diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c > index df947e0..41f85c4 100644 > --- a/kernel/sched/cpuacct.c > +++ b/kernel/sched/cpuacct.c > @@ -316,12 +316,11 @@ static struct cftype files[] = { > void cpuacct_charge(struct task_struct *tsk, u64 cputime) > { > struct cpuacct *ca; > - int index; > + int index = CPUACCT_USAGE_SYSTEM; > + struct pt_regs *regs = task_pt_regs(tsk); > > - if (user_mode(task_pt_regs(tsk))) > + if (regs && user_mode(regs)) > index = CPUACCT_USAGE_USER; > - else > - index = CPUACCT_USAGE_SYSTEM; > > rcu_read_lock(); > > ^ permalink raw reply [flat|nested] 23+ messages in thread
* [tip:sched/core] sched/cpuacct: Check for NULL when using task_pt_regs() 2016-04-06 11:59 ` [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() Anton Blanchard 2016-04-06 13:26 ` Srikar Dronamraju 2016-04-11 6:16 ` Ping? " Michael Ellerman @ 2016-04-13 11:49 ` tip-bot for Anton Blanchard 2 siblings, 0 replies; 23+ messages in thread From: tip-bot for Anton Blanchard @ 2016-04-13 11:49 UTC (permalink / raw) To: linux-tip-commits Cc: hpa, anton, torvalds, linux-kernel, sfr, peterz, mpe, zhaolei, mingo, srikar, tglx Commit-ID: bd92883051a0228cc34996b8e766111ba10c9aac Gitweb: http://git.kernel.org/tip/bd92883051a0228cc34996b8e766111ba10c9aac Author: Anton Blanchard <anton@samba.org> AuthorDate: Wed, 6 Apr 2016 21:59:50 +1000 Committer: Ingo Molnar <mingo@kernel.org> CommitDate: Wed, 13 Apr 2016 13:22:37 +0200 sched/cpuacct: Check for NULL when using task_pt_regs() task_pt_regs() can return NULL for kernel threads, so add a check. This fixes an oops at boot on ppc64. Reported-and-Tested-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com> Tested-by: Zhao Lei <zhaolei@cn.fujitsu.com> Signed-off-by: Anton Blanchard <anton@samba.org> Acked-by: Zhao Lei <zhaolei@cn.fujitsu.com> Cc: Linus Torvalds <torvalds@linux-foundation.org> Cc: Michael Ellerman <mpe@ellerman.id.au> Cc: Peter Zijlstra <peterz@infradead.org> Cc: Stephen Rothwell <sfr@canb.auug.org.au> Cc: Thomas Gleixner <tglx@linutronix.de> Cc: efault@gmx.de Cc: htejun@gmail.com Cc: linuxppc-dev@lists.ozlabs.org Cc: tj@kernel.org Cc: yangds.fnst@cn.fujitsu.com Link: http://lkml.kernel.org/r/20160406215950.04bc3f0b@kryten Signed-off-by: Ingo Molnar <mingo@kernel.org> --- kernel/sched/cpuacct.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c index df947e0..41f85c4 100644 --- a/kernel/sched/cpuacct.c +++ b/kernel/sched/cpuacct.c @@ -316,12 +316,11 @@ static struct cftype files[] = { void cpuacct_charge(struct task_struct *tsk, u64 cputime) { struct cpuacct *ca; - int index; + int index = CPUACCT_USAGE_SYSTEM; + struct pt_regs *regs = task_pt_regs(tsk); - if (user_mode(task_pt_regs(tsk))) + if (regs && user_mode(regs)) index = CPUACCT_USAGE_USER; - else - index = CPUACCT_USAGE_SYSTEM; rcu_read_lock(); ^ permalink raw reply related [flat|nested] 23+ messages in thread
* Re: [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage 2016-03-22 8:37 [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei ` (3 preceding siblings ...) 2016-03-22 8:37 ` [PATCH v4 4/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei @ 2016-03-22 9:44 ` Peter Zijlstra 4 siblings, 0 replies; 23+ messages in thread From: Peter Zijlstra @ 2016-03-22 9:44 UTC (permalink / raw) To: Zhao Lei; +Cc: linux-kernel, Tejun Heo On Tue, Mar 22, 2016 at 04:37:04PM +0800, Zhao Lei wrote: > Sometimes, cpuacct.usage is not detialed enough to user > to see how much usage a group used. We want to know how > much time it used in user mode and how much in kernel mode. > > This patch introduce some more files to tell user these information. > # ls /sys/fs/cgroup/cpuacct/cpuacct.usage* > /sys/fs/cgroup/cpuacct/cpuacct.usage > /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu > /sys/fs/cgroup/cpuacct/cpuacct.usage_user > /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_user > /sys/fs/cgroup/cpuacct/cpuacct.usage_sys > /sys/fs/cgroup/cpuacct/cpuacct.usage_percpu_sys > > Changelog v3->v4: > 1: Add a patch to use for_each_possible_cpu to iterate all cpus, > Suggested-by: Peter Zijlstra <peterz@infradead.org> Thanks! ^ permalink raw reply [flat|nested] 23+ messages in thread
end of thread, other threads:[~2016-04-13 11:52 UTC | newest] Thread overview: 23+ messages (download: mbox.gz / follow: Atom feed) -- links below jump to the message on this page -- 2016-03-22 8:37 [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei 2016-03-22 8:37 ` [PATCH v4 1/4] cpuacct: rename parameter in cpuusage_write for readability Zhao Lei 2016-03-22 8:37 ` [PATCH v4 2/4] cpuacct: small restruct for cpuacct Zhao Lei 2016-03-22 8:37 ` [PATCH v4 3/4] cpuacct: Show possible_cpu in cpuacct Zhao Lei 2016-03-31 9:27 ` [tip:sched/core] sched/cpuacct: Show all possible CPUs in cpuacct output tip-bot for Zhao Lei 2016-03-22 8:37 ` [PATCH v4 4/4] cpuacct: split usage into user_usage and sys_usage Zhao Lei 2016-03-31 9:27 ` [tip:sched/core] sched/cpuacct: Split usage accounting " tip-bot for Dongsheng Yang 2016-04-04 14:03 ` Srikar Dronamraju 2016-04-06 6:54 ` Ingo Molnar 2016-04-06 7:47 ` Zhao Lei 2016-04-06 7:50 ` Peter Zijlstra 2016-04-06 10:32 ` Anton Blanchard 2016-04-06 11:08 ` Peter Zijlstra 2016-04-06 11:43 ` Zhao Lei 2016-04-06 11:59 ` [PATCH] sched/cpuacct: Check for NULL when using task_pt_regs() Anton Blanchard 2016-04-06 13:26 ` Srikar Dronamraju 2016-04-06 17:05 ` Zhao Lei 2016-04-13 7:43 ` Ingo Molnar 2016-04-13 11:01 ` Michael Ellerman 2016-04-13 11:21 ` Ingo Molnar 2016-04-11 6:16 ` Ping? " Michael Ellerman 2016-04-13 11:49 ` [tip:sched/core] " tip-bot for Anton Blanchard 2016-03-22 9:44 ` [PATCH v4 0/4] cpuacct: split usage into user_usage and sys_usage Peter Zijlstra
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).