From: Baokun Li <libaokun1@huawei.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Yi Zhang <yi.zhang@redhat.com>, Ming Lei <ming.lei@redhat.com>,
	<mark.rutland@arm.com>, Christian Brauner <brauner@kernel.org>,
	<linux-fsdevel@vger.kernel.org>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	<linux-kernel@vger.kernel.org>, <linux-scsi@vger.kernel.org>,
	Changhui Zhong <czhong@redhat.com>,
	yangerkun <yangerkun@huawei.com>,
	"zhangyi (F)" <yi.zhang@huawei.com>,
	Kees Cook <keescook@chromium.org>,
	chengzhihao <chengzhihao1@huawei.com>,
	Baokun Li <libaokun1@huawei.com>
Subject: Re: [czhong@redhat.com: [bug report] WARNING: CPU: 121 PID: 93233 at fs/dcache.c:365 __dentry_kill+0x214/0x278]
Date: Mon, 18 Sep 2023 09:52:28 +0800
Message-ID: <a6b10684-39ee-960a-10ab-663746800f85@huawei.com>
In-Reply-To: <20230917092616.GA8409@noisy.programming.kicks-ass.net>

On 2023/9/17 17:26, Peter Zijlstra wrote:
> On Sun, Sep 17, 2023 at 11:10:32AM +0200, Peter Zijlstra wrote:
>> On Sat, Sep 16, 2023 at 02:55:47PM +0800, Baokun Li wrote:
>>> On 2023/9/13 16:59, Yi Zhang wrote:
>>>> The issue still can be reproduced on the latest linux tree[2].
>>>> To reproduce I need to run about 1000 times blktests block/001, and
>>>> bisect shows it was introduced with commit[1], as it was not 100%
>>>> reproduced, not sure if it's the culprit?
>>>>
>>>>
>>>> [1] 9257959a6e5b locking/atomic: scripts: restructure fallback ifdeffery
>>> Hello, everyone!
>>>
>>> We have confirmed that the merge-in of this patch caused hlist_bl_lock
>>> (aka, bit_spin_lock) to fail, which in turn triggered the issue above.
>>> [root@localhost ~]# insmod mymod.ko
>>> [   37.994787][  T621] >>> a = 725, b = 724
>>> [   37.995313][  T621] ------------[ cut here ]------------
>>> [   37.995951][  T621] kernel BUG at fs/mymod/mymod.c:42!
>>> [   37.996461][  T621] Internal error: Oops - BUG: 00000000f2000800 [#1] SMP
>>> [   37.997420][  T621] Modules linked in: mymod(E)
>>> [   37.997891][  T621] CPU: 9 PID: 621 Comm: bl_lock_thread2 Tainted: G            E      6.4.0-rc2-00034-g9257959a6e5b-dirty #117
>>> [   37.999038][  T621] Hardware name: linux,dummy-virt (DT)
>>> [   37.999571][  T621] pstate: 60400005 (nZCv daif +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
>>> [   38.000344][  T621] pc : increase_ab+0xcc/0xe70 [mymod]
>>> [   38.000882][  T621] lr : increase_ab+0xcc/0xe70 [mymod]
>>> [   38.001416][  T621] sp : ffff800008b4be40
>>> [   38.001822][  T621] x29: ffff800008b4be40 x28: 0000000000000000 x27: 0000000000000000
>>> [   38.002605][  T621] x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000000000
>>> [   38.003385][  T621] x23: ffffd9930c698190 x22: ffff800008a0ba38 x21: 0000000000000001
>>> [   38.004174][  T621] x20: ffffffffffffefff x19: ffffd9930c69a580 x18: 0000000000000000
>>> [   38.004955][  T621] x17: 0000000000000000 x16: ffffd9933011bd38 x15: ffffffffffffffff
>>> [   38.005754][  T621] x14: 0000000000000000 x13: 205d313236542020 x12: ffffd99332175b80
>>> [   38.006538][  T621] x11: 0000000000000003 x10: 0000000000000001 x9 : ffffd9933022a9d8
>>> [   38.007325][  T621] x8 : 00000000000bffe8 x7 : c0000000ffff7fff x6 : ffffd993320b5b40
>>> [   38.008124][  T621] x5 : ffff0001f7d1c708 x4 : 0000000000000000 x3 : 0000000000000000
>>> [   38.008912][  T621] x2 : 0000000000000000 x1 : 0000000000000000 x0 : 0000000000000015
>>> [   38.009709][  T621] Call trace:
>>> [   38.010035][  T621]  increase_ab+0xcc/0xe70 [mymod]
>>> [   38.010539][  T621]  kthread+0xdc/0xf0
>>> [   38.010927][  T621]  ret_from_fork+0x10/0x20
>>> [   38.011370][  T621] Code: 17ffffe0 90000020 91044000 9400000d (d4210000)
>>> [   38.012067][  T621] ---[ end trace 0000000000000000 ]---
>> Is this arm64 or something? You seem to have forgotten to mention what
>> platform you're using.
> Is that an LSE or LLSC arm64 ?

I'm not sure how to tell whether it's LSE or LL/SC. Here is some
information about the CPU:

$ cat /sys/devices/system/cpu/cpu0/regs/identification/midr_el1
0x00000000481fd010
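
For reference, the MIDR_EL1 value above can be split into its
architectural fields (implementer, variant, architecture, part number,
revision). The following is only a minimal sketch: the helper name
decode_midr is made up for illustration, and the comments on what 0x48
and 0xd01 denote are my reading of the kernel's
arch/arm64/include/asm/cputype.h, not something stated in this thread.

```python
def decode_midr(midr: int) -> dict:
    """Split a MIDR_EL1 value into its architectural bit fields."""
    return {
        "implementer": (midr >> 24) & 0xFF,   # bits [31:24]
        "variant": (midr >> 20) & 0xF,        # bits [23:20]
        "architecture": (midr >> 16) & 0xF,   # bits [19:16]
        "partnum": (midr >> 4) & 0xFFF,       # bits [15:4]
        "revision": midr & 0xF,               # bits [3:0]
    }


if __name__ == "__main__":
    # Value copied from the sysfs output above.
    fields = decode_midr(int("0x00000000481fd010", 16))
    for name, value in fields.items():
        print(f"{name:12s} = {value:#x}")
    # Assumption: 0x48 is the HiSilicon implementer code and 0xd01 the
    # TSV110 part number (the core used in Kunpeng 920).
```

This only identifies the core. It does not say by itself which atomic
implementation the running kernel selected.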

$ lscpu
Architecture:        aarch64
Byte Order:          Little Endian
CPU(s):              96
On-line CPU(s) list: 0-95
Thread(s) per core:  1
Core(s) per socket:  48
Socket(s):           2
NUMA node(s):        4
Vendor ID:           HiSilicon
BIOS Vendor ID:      HiSilicon
Model:               0
Model name:          Kunpeng-920
BIOS Model name:     Kunpeng 920-4826
Stepping:            0x1
BogoMIPS:            200.00
L1d cache:           64K
L1i cache:           64K
L2 cache:            512K
L3 cache:            49152K
NUMA node0 CPU(s):   0-23
NUMA node1 CPU(s):   24-47
NUMA node2 CPU(s):   48-71
NUMA node3 CPU(s):   72-95
Flags:               fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm
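
As far as I understand, the "atomics" entry in the flags list is the
arm64 hwcap for the ARMv8.1 LSE atomic instructions, so checking for it
is one way to see whether the hardware supports LSE. A minimal sketch of
that check follows; the helper name has_feature is hypothetical, and the
check says nothing about whether the kernel was built with
CONFIG_ARM64_LSE_ATOMICS, which also matters for what the kernel
actually uses.

```python
def has_feature(flags: str, name: str) -> bool:
    """Return True if `name` appears as a whole word in a flags string."""
    return name in flags.split()


# Flags copied from the lscpu output above.
LSCPU_FLAGS = (
    "fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics "
    "fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm"
)

if __name__ == "__main__":
    if has_feature(LSCPU_FLAGS, "atomics"):
        print("CPU advertises LSE atomics")
    else:
        print("no LSE atomics flag; LL/SC only")
```

The match is done on whole words so that a flag such as "asimd" does not
accidentally match "asimdhp".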

> Anyway, it seems that ARM64 shouldn't be using the fallback as it does
> everything itself.
>
> Mark, can you have a look please? At first glance the
> atomic64_fetch_or_acquire() that's being used by generic bitops/lock.h
> seems in order..
>
We also suspect some implicit change in the behavior of
raw_atomic64_fetch_or_acquire. To make the problem easier to locate, you
can reproduce it with the test module above. If you can't reproduce it on
your end, I can reproduce it here and collect whatever information you
need.

-- 
With Best Regards,
Baokun Li

