From: Barry Song <21cnbao@gmail.com>
To: "Wangshaobo (bobo)" <bobo.shaobowang@huawei.com>
Cc: LKML <linux-kernel@vger.kernel.org>,
	"Rafael J. Wysocki" <rafael@kernel.org>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	Sudeep Holla <sudeep.holla@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	cj.chengjian@huawei.com, huawei.libin@huawei.com,
	weiyongjun1@huawei.com
Subject: Re: [PATCH] arch_topology: Fix missing clear cluster_cpumask in remove_cpu_topology()
Date: Thu, 11 Nov 2021 22:08:29 +1300
Message-ID: <CAGsJ_4yLV_-fwtH1=bmGxfcK9_TwUK5Typ_pE=zYQNt=YHoFVA@mail.gmail.com>
In-Reply-To: <CAGsJ_4xFhcUaVYzVKc2EXs9FsnmPoLmmKqxiDpExwUeTyOyDMg@mail.gmail.com>

On Thu, Nov 11, 2021 at 10:07 PM Barry Song <21cnbao@gmail.com> wrote:
>
> On Thu, Nov 11, 2021 at 8:25 PM Wangshaobo (bobo)
> <bobo.shaobowang@huawei.com> wrote:
> >
> >
> > On 2021/11/11 14:25, Barry Song wrote:
> >
> > On Wed, Nov 10, 2021 at 10:53 PM Wang ShaoBo <bobo.shaobowang@huawei.com> wrote:
> >
> > When testing CPU online and offline, the following warning was triggered:
> >
> > [  146.746743] WARNING: CPU: 92 PID: 974 at kernel/sched/topology.c:2215 build_sched_domains+0x81c/0x11b0
> > [  146.749988] CPU: 92 PID: 974 Comm: kworker/92:2 Not tainted 5.15.0 #9
> > [  146.750402] Hardware name: Huawei TaiShan 2280 V2/BC82AMDDA, BIOS 1.79 08/21/2021
> > [  146.751213] Workqueue: events cpuset_hotplug_workfn
> > [  146.751629] pstate: 00400009 (nzcv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
> > [  146.752048] pc : build_sched_domains+0x81c/0x11b0
> > [  146.752461] lr : build_sched_domains+0x414/0x11b0
> > [  146.752860] sp : ffff800040a83a80
> > [  146.753247] x29: ffff800040a83a80 x28: ffff20801f13a980 x27: ffff20800448ae00
> > [  146.753644] x26: ffff800012a858e8 x25: ffff800012ea48c0 x24: 0000000000000000
> > [  146.754039] x23: ffff800010ab7d60 x22: ffff800012f03758 x21: 000000000000005f
> > [  146.754427] x20: 000000000000005c x19: ffff004080012840 x18: ffffffffffffffff
> > [  146.754814] x17: 3661613030303230 x16: 30303078303a3239 x15: ffff800011f92b48
> > [  146.755197] x14: ffff20be3f95cef6 x13: 2e6e69616d6f642d x12: 6465686373204c4c
> > [  146.755578] x11: ffff20bf7fc83a00 x10: 0000000000000040 x9 : 0000000000000000
> > [  146.755957] x8 : 0000000000000002 x7 : ffffffffe0000000 x6 : 0000000000000002
> > [  146.756334] x5 : 0000000090000000 x4 : 00000000f0000000 x3 : 0000000000000001
> > [  146.756705] x2 : 0000000000000080 x1 : ffff800012f03860 x0 : 0000000000000001
> > [  146.757070] Call trace:
> > [  146.757421]  build_sched_domains+0x81c/0x11b0
> > [  146.757771]  partition_sched_domains_locked+0x57c/0x978
> > [  146.758118]  rebuild_sched_domains_locked+0x44c/0x7f0
> > [  146.758460]  rebuild_sched_domains+0x2c/0x48
> > [  146.758791]  cpuset_hotplug_workfn+0x3fc/0x888
> > [  146.759114]  process_one_work+0x1f4/0x480
> > [  146.759429]  worker_thread+0x48/0x460
> > [  146.759734]  kthread+0x158/0x168
> > [  146.760030]  ret_from_fork+0x10/0x20
> > [  146.760318] ---[ end trace 82c44aad6900e81a ]---
> >
> > Some architectures, such as RISC-V and arm64, use the common code
> > clear_cpu_topology() when shutting down CPUx. When CONFIG_SCHED_CLUSTER
> > is set, CPUx is not cleared from the cluster_sibling mask in the
> > cpu_topology entry of each of its siblings. This makes the check in
> > topology_span_sane() fail, so rebuilding the topology fails when a CPU
> > is brought back online.
> >
> > cluster_sibling of each sibling in cpu_topology[] after CPU92 goes offline
> > (CPUs 92, 93, 94 and 95 are in one cluster):
> >
> > Before revision:
> > CPU                 [92]      [93]      [94]      [95]
> > cluster_sibling     [92]     [92-95]   [92-95]   [92-95]
> >
> > After revision:
> > CPU                 [92]      [93]      [94]      [95]
> > cluster_sibling     [92]     [93-95]   [93-95]   [93-95]
> >
> > Signed-off-by: Wang ShaoBo <bobo.shaobowang@huawei.com>
> >
> > The patch looks correct. But how do you reproduce it?
> >
> > Hi Barry,
> >
> > You can try this test case on a Kunpeng 920:
> >
> >
> echo 0 > cpu92/online
> echo 0 > cpu93/online
> echo 1 > cpu92/online
>
> Yes. I was taking the whole cluster offline. This warning can only be
> reproduced when we offline some of the CPUs in a cluster and then
> bring one of the offlined CPUs back online.
>
> Acked-by: Barry Song <song.bao.hua@hisilicon.com>
>
> The commit log might need some refinement to explain how to reproduce the issue.

And also a Fixes: tag.

>
> >
> > - Wang ShaoBo
>
> Thanks
> Barry


Thread overview: 12+ messages
2021-11-10  9:58 [PATCH] arch_topology: Fix missing clear cluster_cpumask in remove_cpu_topology() Wang ShaoBo
2021-11-11  6:25 ` Barry Song
     [not found]   ` <943fef84-3920-42bc-b83f-4feaa3ab79f3@huawei.com>
2021-11-11  9:07     ` Barry Song
2021-11-11  9:08       ` Barry Song [this message]
2021-11-11 12:04         ` Dietmar Eggemann
2021-11-11 12:22 ` [tip: sched/urgent] " tip-bot2 for Wang ShaoBo
2021-11-11 14:04   ` Wangshaobo (bobo)
2021-11-11 12:46 ` [PATCH] " Sudeep Holla
2021-11-26 16:28 ` Greg KH
2021-11-26 18:39   ` Sudeep Holla
2021-11-27  9:07     ` Greg KH
2021-11-30  1:08       ` Barry Song
