* [PATCH][RFC] sched/cpuacct: Fix cpuacct charge
From: Li RongQing @ 2021-07-20 6:04 UTC
To: mingo, peterz, juri.lelli, vincent.guittot, dietmar.eggemann,
    rostedt, bsegall, mgorman, bristot, linux-kernel, songmuchun

get_irq_regs() only works for the currently running CPU, but the task
whose cpuacct is being charged may be running on a different CPU. For
example, when CPU 2 wakes up a kernel thread on CPU 3, the CPU 3 task
is charged through the following stack:

  cpuacct_charge+0xd8/0x100
  update_curr+0xe1/0x1e0
  enqueue_entity+0x144/0x6e0
  enqueue_task_fair+0x93/0x900
  ttwu_do_activate+0x4b/0x90
  try_to_wake_up+0x20b/0x530
  ? update_dl_rq_load_avg+0x10f/0x210
  swake_up_locked.part.1+0x13/0x40
  swake_up_one+0x27/0x40
  rcu_process_callbacks+0x484/0x4f0
  ? run_rebalance_domains_bt+0x5a/0x180
  __do_softirq+0xe3/0x308
  irq_exit+0xf0/0x100
  smp_apic_timer_interrupt+0x74/0x160
  apic_timer_interrupt+0xf/0x20
  </IRQ>
  RIP: 0033:0x456947

So define get_irq_regs_cpu(), which returns the IRQ registers of the
requested CPU.

BUT this is probably not safe, and I do not know what it should look
like on MIPS?

Fixes: dbe9337109c2 ("sched/cpuacct: Fix charge cpuacct.usage_sys")
Co-developed-by: Zhao Jie <zhaojie17@baidu.com>
Signed-off-by: Zhao Jie <zhaojie17@baidu.com>
Signed-off-by: Li RongQing <lirongqing@baidu.com>
---
 include/asm-generic/irq_regs.h | 5 +++++
 kernel/sched/cpuacct.c         | 3 ++-
 2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/irq_regs.h b/include/asm-generic/irq_regs.h
index 2e7c6e8..93e2579 100644
--- a/include/asm-generic/irq_regs.h
+++ b/include/asm-generic/irq_regs.h
@@ -21,6 +21,11 @@ static inline struct pt_regs *get_irq_regs(void)
 	return __this_cpu_read(__irq_regs);
 }
 
+static inline struct pt_regs *get_irq_regs_cpu(int cpu)
+{
+	return per_cpu(__irq_regs, cpu);
+}
+
 static inline struct pt_regs *set_irq_regs(struct pt_regs *new_regs)
 {
 	struct pt_regs *old_regs;
diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 893eece..8b96058 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -340,7 +340,8 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime)
 {
 	struct cpuacct *ca;
 	int index = CPUACCT_STAT_SYSTEM;
-	struct pt_regs *regs = get_irq_regs() ? : task_pt_regs(tsk);
+	int cpu = task_cpu(tsk);
+	struct pt_regs *regs = get_irq_regs_cpu(cpu) ? : task_pt_regs(tsk);
 
 	if (regs && user_mode(regs))
 		index = CPUACCT_STAT_USER;
-- 
2.9.4

* Re: [PATCH][RFC] sched/cpuacct: Fix cpuacct charge
From: Peter Zijlstra @ 2021-07-29 10:19 UTC
To: Li RongQing
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
    bsegall, mgorman, bristot, linux-kernel, songmuchun

On Tue, Jul 20, 2021 at 02:04:41PM +0800, Li RongQing wrote:
> get_irq_regs() only works for the currently running CPU, but the task
> whose cpuacct is being charged may be running on a different CPU. For
> example, when CPU 2 wakes up a kernel thread on CPU 3, the CPU 3 task
> is charged through the following stack:
>
>   cpuacct_charge+0xd8/0x100
>   update_curr+0xe1/0x1e0
>   enqueue_entity+0x144/0x6e0
>   enqueue_task_fair+0x93/0x900
>   ttwu_do_activate+0x4b/0x90
>   try_to_wake_up+0x20b/0x530
>   ? update_dl_rq_load_avg+0x10f/0x210
>   swake_up_locked.part.1+0x13/0x40
>   swake_up_one+0x27/0x40
>   rcu_process_callbacks+0x484/0x4f0
>   ? run_rebalance_domains_bt+0x5a/0x180
>   __do_softirq+0xe3/0x308
>   irq_exit+0xf0/0x100
>   smp_apic_timer_interrupt+0x74/0x160
>   apic_timer_interrupt+0xf/0x20
>   </IRQ>
>   RIP: 0033:0x456947
>
> So define get_irq_regs_cpu(), which returns the IRQ registers of the
> requested CPU.
>
> BUT this is probably not safe, and I do not know what it should look
> like on MIPS?
>
> Fixes: dbe9337109c2 ("sched/cpuacct: Fix charge cpuacct.usage_sys")
> Co-developed-by: Zhao Jie <zhaojie17@baidu.com>
> Signed-off-by: Zhao Jie <zhaojie17@baidu.com>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
>  include/asm-generic/irq_regs.h | 5 +++++
>  kernel/sched/cpuacct.c         | 3 ++-
>  2 files changed, 7 insertions(+), 1 deletion(-)
>
> diff --git a/include/asm-generic/irq_regs.h b/include/asm-generic/irq_regs.h
> index 2e7c6e8..93e2579 100644
> --- a/include/asm-generic/irq_regs.h
> +++ b/include/asm-generic/irq_regs.h
> @@ -21,6 +21,11 @@ static inline struct pt_regs *get_irq_regs(void)
>  	return __this_cpu_read(__irq_regs);
>  }
>
> +static inline struct pt_regs *get_irq_regs_cpu(int cpu)
> +{
> +	return per_cpu(__irq_regs, cpu);
> +}

This primitive just cannot be right... it'll get you some random data.

* Re: [PATCH][RFC] sched/cpuacct: Fix cpuacct charge
From: Li,Rongqing @ 2021-07-30 8:16 UTC
To: Peter Zijlstra
Cc: mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
    bsegall, mgorman, bristot, linux-kernel, songmuchun

> -----Original Message-----
> From: Peter Zijlstra <peterz@infradead.org>
> Sent: July 29, 2021 18:20
> To: Li,Rongqing <lirongqing@baidu.com>
> Cc: mingo@redhat.com; juri.lelli@redhat.com; vincent.guittot@linaro.org;
> dietmar.eggemann@arm.com; rostedt@goodmis.org; bsegall@google.com;
> mgorman@suse.de; bristot@redhat.co; linux-kernel@vger.kernel.org;
> songmuchun@bytedance.com
> Subject: Re: [PATCH][RFC] sched/cpuacct: Fix cpuacct charge
>
> On Tue, Jul 20, 2021 at 02:04:41PM +0800, Li RongQing wrote:
> > get_irq_regs() only works for the currently running CPU, but the task
> > whose cpuacct is being charged may be running on a different CPU. For
> > example, when CPU 2 wakes up a kernel thread on CPU 3, the CPU 3 task
> > is charged through the following stack:
> >
> >   cpuacct_charge+0xd8/0x100
> >   update_curr+0xe1/0x1e0
> >   enqueue_entity+0x144/0x6e0
> >   enqueue_task_fair+0x93/0x900
> >   ttwu_do_activate+0x4b/0x90
> >   try_to_wake_up+0x20b/0x530
> >   ? update_dl_rq_load_avg+0x10f/0x210
> >   swake_up_locked.part.1+0x13/0x40
> >   swake_up_one+0x27/0x40
> >   rcu_process_callbacks+0x484/0x4f0
> >   ? run_rebalance_domains_bt+0x5a/0x180
> >   __do_softirq+0xe3/0x308
> >   irq_exit+0xf0/0x100
> >   smp_apic_timer_interrupt+0x74/0x160
> >   apic_timer_interrupt+0xf/0x20
> >   </IRQ>
> >   RIP: 0033:0x456947
> >
> > So define get_irq_regs_cpu(), which returns the IRQ registers of the
> > requested CPU.
> >
> > BUT this is probably not safe, and I do not know what it should look
> > like on MIPS?
> >
> > Fixes: dbe9337109c2 ("sched/cpuacct: Fix charge cpuacct.usage_sys")
> > Co-developed-by: Zhao Jie <zhaojie17@baidu.com>
> > Signed-off-by: Zhao Jie <zhaojie17@baidu.com>
> > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > ---
> >  include/asm-generic/irq_regs.h | 5 +++++
> >  kernel/sched/cpuacct.c         | 3 ++-
> >  2 files changed, 7 insertions(+), 1 deletion(-)
> >
> > diff --git a/include/asm-generic/irq_regs.h b/include/asm-generic/irq_regs.h
> > index 2e7c6e8..93e2579 100644
> > --- a/include/asm-generic/irq_regs.h
> > +++ b/include/asm-generic/irq_regs.h
> > @@ -21,6 +21,11 @@ static inline struct pt_regs *get_irq_regs(void)
> >  	return __this_cpu_read(__irq_regs);
> >  }
> >
> > +static inline struct pt_regs *get_irq_regs_cpu(int cpu)
> > +{
> > +	return per_cpu(__irq_regs, cpu);
> > +}
>
> This primitive just cannot be right... it'll get you some random data.

True.

This does not seem easy to fix. How about a partial fix?

diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
index 893eece..48b117e 100644
--- a/kernel/sched/cpuacct.c
+++ b/kernel/sched/cpuacct.c
@@ -340,7 +340,12 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime)
 {
 	struct cpuacct *ca;
 	int index = CPUACCT_STAT_SYSTEM;
-	struct pt_regs *regs = get_irq_regs() ? : task_pt_regs(tsk);
+	struct pt_regs *regs;
+
+	if (task_cpu(tsk) == raw_smp_processor_id())
+		regs = get_irq_regs() ? : task_pt_regs(tsk);
+	else
+		regs = task_pt_regs(tsk);
 
 	if (regs && user_mode(regs))
 		index = CPUACCT_STAT_USER;

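The difference between the original classification and the partial fix can be seen in a small user-space model. This is only a sketch under stated assumptions, not kernel code: every `toy_`/`TOY_` name is hypothetical, `struct toy_regs` reduces `struct pt_regs` to a single "was user mode interrupted" flag, and the per-CPU IRQ register pointer is modelled as a plain array indexed by CPU number.

```c
#include <stdbool.h>
#include <stddef.h>

#define TOY_NR_CPUS 4

enum toy_stat_index { TOY_STAT_USER, TOY_STAT_SYSTEM };

/* Hypothetical stand-in for struct pt_regs: records only the interrupted mode. */
struct toy_regs {
	bool user_mode;
};

/* Hypothetical stand-in for a task: its CPU and its own saved registers. */
struct toy_task {
	int cpu;
	struct toy_regs saved_regs;
};

/* Per-CPU IRQ register pointer; NULL while that CPU is not in an interrupt. */
static const struct toy_regs *toy_irq_regs[TOY_NR_CPUS];

static enum toy_stat_index toy_index_for(const struct toy_regs *regs)
{
	return regs->user_mode ? TOY_STAT_USER : TOY_STAT_SYSTEM;
}

/* Original logic: always consult the IRQ registers of the calling CPU. */
static enum toy_stat_index toy_classify_original(int this_cpu,
						 const struct toy_task *tsk)
{
	const struct toy_regs *regs = toy_irq_regs[this_cpu];

	if (!regs)
		regs = &tsk->saved_regs;
	return toy_index_for(regs);
}

/* Partial fix: trust the IRQ registers only when the task is on this CPU. */
static enum toy_stat_index toy_classify_partial_fix(int this_cpu,
						    const struct toy_task *tsk)
{
	const struct toy_regs *regs = NULL;

	if (tsk->cpu == this_cpu)
		regs = toy_irq_regs[this_cpu];
	if (!regs)
		regs = &tsk->saved_regs;
	return toy_index_for(regs);
}
```

With CPU 2 inside an interrupt that arrived while user code was running (as the `RIP: 0033:` line in the trace suggests) and a kernel thread assigned to CPU 3, `toy_classify_original(2, &kthread)` yields the user index, while `toy_classify_partial_fix(2, &kthread)` falls back to the task's own saved registers and yields the system index. The model says nothing about whether those saved registers are themselves reliable.
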
* Re: Re: [PATCH][RFC] sched/cpuacct: Fix cpuacct charge
From: Daniel Jordan @ 2021-08-16 16:21 UTC
To: Li,Rongqing
Cc: Peter Zijlstra, mingo, juri.lelli, vincent.guittot,
    dietmar.eggemann, rostedt, bsegall, mgorman, bristot, linux-kernel,
    songmuchun, Michal Koutný

On Fri, Jul 30, 2021 at 08:16:54AM +0000, Li,Rongqing wrote:
> > On Tue, Jul 20, 2021 at 02:04:41PM +0800, Li RongQing wrote:
> > > get_irq_regs() only works for the currently running CPU, but the task
> > > whose cpuacct is being charged may be running on a different CPU. For
> > > example, when CPU 2 wakes up a kernel thread on CPU 3, the CPU 3 task
> > > is charged through the following stack:
> > >
> > >   cpuacct_charge+0xd8/0x100
> > >   update_curr+0xe1/0x1e0
> > >   enqueue_entity+0x144/0x6e0
> > >   enqueue_task_fair+0x93/0x900
> > >   ttwu_do_activate+0x4b/0x90
> > >   try_to_wake_up+0x20b/0x530
> > >   ? update_dl_rq_load_avg+0x10f/0x210
> > >   swake_up_locked.part.1+0x13/0x40
> > >   swake_up_one+0x27/0x40
> > >   rcu_process_callbacks+0x484/0x4f0
> > >   ? run_rebalance_domains_bt+0x5a/0x180
> > >   __do_softirq+0xe3/0x308
> > >   irq_exit+0xf0/0x100
> > >   smp_apic_timer_interrupt+0x74/0x160
> > >   apic_timer_interrupt+0xf/0x20
> > >   </IRQ>
> > >   RIP: 0033:0x456947
> > >
> > > So define get_irq_regs_cpu(), which returns the IRQ registers of the
> > > requested CPU.
> > >
> > > BUT this is probably not safe, and I do not know what it should look
> > > like on MIPS?
> > >
> > > Fixes: dbe9337109c2 ("sched/cpuacct: Fix charge cpuacct.usage_sys")
> > > Co-developed-by: Zhao Jie <zhaojie17@baidu.com>
> > > Signed-off-by: Zhao Jie <zhaojie17@baidu.com>
> > > Signed-off-by: Li RongQing <lirongqing@baidu.com>
> > > ---
> > >  include/asm-generic/irq_regs.h | 5 +++++
> > >  kernel/sched/cpuacct.c         | 3 ++-
> > >  2 files changed, 7 insertions(+), 1 deletion(-)
> > >
> > > diff --git a/include/asm-generic/irq_regs.h b/include/asm-generic/irq_regs.h
> > > index 2e7c6e8..93e2579 100644
> > > --- a/include/asm-generic/irq_regs.h
> > > +++ b/include/asm-generic/irq_regs.h
> > > @@ -21,6 +21,11 @@ static inline struct pt_regs *get_irq_regs(void)
> > >  	return __this_cpu_read(__irq_regs);
> > >  }
> > >
> > > +static inline struct pt_regs *get_irq_regs_cpu(int cpu)
> > > +{
> > > +	return per_cpu(__irq_regs, cpu);
> > > +}
> >
> > This primitive just cannot be right... it'll get you some random data.
>
> True.
>
> This does not seem easy to fix. How about a partial fix?
>
> diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> index 893eece..48b117e 100644
> --- a/kernel/sched/cpuacct.c
> +++ b/kernel/sched/cpuacct.c
> @@ -340,7 +340,12 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime)
>  {
>  	struct cpuacct *ca;
>  	int index = CPUACCT_STAT_SYSTEM;
> -	struct pt_regs *regs = get_irq_regs() ? : task_pt_regs(tsk);
> +	struct pt_regs *regs;
> +
> +	if (task_cpu(tsk) == raw_smp_processor_id())
> +		regs = get_irq_regs() ? : task_pt_regs(tsk);
> +	else
> +		regs = task_pt_regs(tsk);
>
>  	if (regs && user_mode(regs))
>  		index = CPUACCT_STAT_USER;

It still suffers from task_pt_regs().

Why not make cpuacct use cgroup2's approach? Remember only delta_exec
here, then on reading cpuacct.usage_*, use cputime_adjust() to scale the
user/sys from cpuacct_account_field().

It's arguably more than just a fix for cgroup1, but there have been a few
complaints about this function lately.

>  	rcu_read_lock();
>
>  	for (ca = task_ca(tsk); ca; ca = parent_ca(ca))
>  		__this_cpu_add(ca->cpuusage->usages[index], cputime);
>
>  	rcu_read_unlock();

By the way, I think the __this_cpu_add() can be wrong in cases like the
one you originally describe. Seems like a bug in 73e6aafd9ea8
("sched/cpuacct: Simplify the cpuacct code").

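The scaling idea suggested above (record only the exact runtime at charge time, then split it into user and system parts when the value is read) can be sketched in user space as follows. This is a simplified model, not the kernel's cputime_adjust(): the names `toy_split` and `toy_scale_runtime` are hypothetical, the tick counts stand in for whatever tick-based user/system samples are available, and the monotonicity guarantees a real implementation would need across successive reads are left out.

```c
#include <stdint.h>

/* Hypothetical result type: an exact runtime split into two parts. */
struct toy_split {
	uint64_t user_ns;
	uint64_t system_ns;
};

/*
 * Split runtime_ns in the same ratio as the sampled tick counts.
 *
 * floor(runtime * system_ticks / total) is computed as
 * q * system_ticks + (r * system_ticks) / total, with
 * runtime = q * total + r, so the large product runtime * system_ticks
 * is never formed. Assumption: tick counts are small enough that
 * r * system_ticks fits in 64 bits.
 */
static struct toy_split toy_scale_runtime(uint64_t runtime_ns,
					  uint64_t user_ticks,
					  uint64_t system_ticks)
{
	struct toy_split out;
	uint64_t total = user_ticks + system_ticks;

	if (total == 0) {
		/* No samples at all: this sketch charges everything to user. */
		out.user_ns = runtime_ns;
		out.system_ns = 0;
		return out;
	}

	out.system_ns = (runtime_ns / total) * system_ticks +
			((runtime_ns % total) * system_ticks) / total;
	/* The two parts always add up to the exact runtime. */
	out.user_ns = runtime_ns - out.system_ns;
	return out;
}
```

The point of the design is that the charge path no longer needs any register state: it only accumulates runtime, and the user/system ratio comes from a separate accounting source at read time.
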
* Re: Re: [PATCH][RFC] sched/cpuacct: Fix cpuacct charge
From: Li,Rongqing @ 2021-08-17 3:55 UTC
To: Daniel Jordan
Cc: Peter Zijlstra, mingo, juri.lelli, vincent.guittot,
    dietmar.eggemann, rostedt, bsegall, mgorman, bristot, linux-kernel,
    songmuchun, Michal Koutný

> > diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> > index 893eece..48b117e 100644
> > --- a/kernel/sched/cpuacct.c
> > +++ b/kernel/sched/cpuacct.c
> > @@ -340,7 +340,12 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime)
> >  {
> >  	struct cpuacct *ca;
> >  	int index = CPUACCT_STAT_SYSTEM;
> > -	struct pt_regs *regs = get_irq_regs() ? : task_pt_regs(tsk);
> > +	struct pt_regs *regs;
> > +
> > +	if (task_cpu(tsk) == raw_smp_processor_id())
> > +		regs = get_irq_regs() ? : task_pt_regs(tsk);
> > +	else
> > +		regs = task_pt_regs(tsk);
> >
> >  	if (regs && user_mode(regs))
> >  		index = CPUACCT_STAT_USER;
>
> It still suffers from task_pt_regs().
>
> Why not make cpuacct use cgroup2's approach? Remember only delta_exec
> here, then on reading cpuacct.usage_*, use cputime_adjust() to scale the
> user/sys from cpuacct_account_field().
>

I think your suggestion is reasonable. Could you send a patch?

> It's arguably more than just a fix for cgroup1, but there have been a few
> complaints about this function lately.
>
> >  	rcu_read_lock();
> >
> >  	for (ca = task_ca(tsk); ca; ca = parent_ca(ca))
> >  		__this_cpu_add(ca->cpuusage->usages[index], cputime);
> >
> >  	rcu_read_unlock();
>
> By the way, I think the __this_cpu_add() can be wrong in cases like the
> one you originally describe. Seems like a bug in 73e6aafd9ea8
> ("sched/cpuacct: Simplify the cpuacct code").

We found this issue too.

-Li

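The per-CPU concern raised above is separate from the user/system split: a charge issued from the waking CPU lands in that CPU's usage bucket rather than in the bucket of the CPU the task belongs to. The following user-space model illustrates only that effect. It is a sketch, not the kernel's per-CPU API: `toy_cpuacct` and both helpers are hypothetical names, and indexing by the task's CPU is shown as one possible direction, not as something the thread settles on.

```c
#include <stdint.h>

#define TOY_NR_CPUS 4

/* Hypothetical stand-in for one cgroup's per-CPU usage counters. */
struct toy_cpuacct {
	uint64_t usage[TOY_NR_CPUS];
};

/* Models __this_cpu_add(): the bucket of the CPU executing the charge. */
static void toy_charge_this_cpu(struct toy_cpuacct *ca, int this_cpu,
				int task_cpu, uint64_t cputime)
{
	(void)task_cpu; /* ignored, which is the problem being illustrated */
	ca->usage[this_cpu] += cputime;
}

/* Alternative: the bucket of the CPU the charged task is assigned to. */
static void toy_charge_task_cpu(struct toy_cpuacct *ca, int this_cpu,
				int task_cpu, uint64_t cputime)
{
	(void)this_cpu;
	ca->usage[task_cpu] += cputime;
}
```

When CPU 2 charges 100 units for a task assigned to CPU 3, the first helper increments `usage[2]` and leaves `usage[3]` untouched, while the second increments `usage[3]`. The group total is the same either way; only the per-CPU breakdown differs.
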
* Re: Re: Re: [PATCH][RFC] sched/cpuacct: Fix cpuacct charge
From: Daniel Jordan @ 2021-08-18 15:33 UTC
To: Li,Rongqing
Cc: Peter Zijlstra, mingo, juri.lelli, vincent.guittot,
    dietmar.eggemann, rostedt, bsegall, mgorman, bristot, linux-kernel,
    songmuchun, Michal Koutný

On Tue, Aug 17, 2021 at 03:55:08AM +0000, Li,Rongqing wrote:
> > > diff --git a/kernel/sched/cpuacct.c b/kernel/sched/cpuacct.c
> > > index 893eece..48b117e 100644
> > > --- a/kernel/sched/cpuacct.c
> > > +++ b/kernel/sched/cpuacct.c
> > > @@ -340,7 +340,12 @@ void cpuacct_charge(struct task_struct *tsk, u64 cputime)
> > >  {
> > >  	struct cpuacct *ca;
> > >  	int index = CPUACCT_STAT_SYSTEM;
> > > -	struct pt_regs *regs = get_irq_regs() ? : task_pt_regs(tsk);
> > > +	struct pt_regs *regs;
> > > +
> > > +	if (task_cpu(tsk) == raw_smp_processor_id())
> > > +		regs = get_irq_regs() ? : task_pt_regs(tsk);
> > > +	else
> > > +		regs = task_pt_regs(tsk);
> > >
> > >  	if (regs && user_mode(regs))
> > >  		index = CPUACCT_STAT_USER;
> >
> > It still suffers from task_pt_regs().
> >
> > Why not make cpuacct use cgroup2's approach? Remember only delta_exec
> > here, then on reading cpuacct.usage_*, use cputime_adjust() to scale the
> > user/sys from cpuacct_account_field().
> >
>
> I think your suggestion is reasonable. Could you send a patch?

I'll leave that to someone else, got other things going on for now.
