* [f2fs-dev] [Bug 210745] New: kernel crash during umounting a partition with f2fs filesystem
@ 2020-12-17 6:43 bugzilla-daemon
2020-12-18 10:27 ` [f2fs-dev] [Bug 210745] " bugzilla-daemon
` (3 more replies)
0 siblings, 4 replies; 5+ messages in thread
From: bugzilla-daemon @ 2020-12-17 6:43 UTC (permalink / raw)
To: linux-f2fs-devel
https://bugzilla.kernel.org/show_bug.cgi?id=210745
Bug ID: 210745
Summary: kernel crash during umounting a partition with f2fs
filesystem
Product: File System
Version: 2.5
Kernel Version: 4.14.193
Hardware: All
OS: Linux
Tree: Mainline
Status: NEW
Severity: high
Priority: P1
Component: f2fs
Assignee: filesystem_f2fs@kernel-bugs.kernel.org
Reporter: Zhiguo.Niu@unisoc.com
Regression: No
Hi,
When we run reboot stress tests on a device, we occasionally hit the following
kernel crash:
[ 42.035226] c6 Unable to handle kernel NULL pointer dereference at virtual
address 0000000a
[ 43.437464] c6 __list_del_entry_valid+0xc/0xd8
[ 43.441962] c6 f2fs_destroy_node_manager+0x218/0x398
[ 43.446984] c6 f2fs_put_super+0x19c/0x2b8
[ 43.451052] c6 generic_shutdown_super+0x70/0xf8
[ 43.455635] c6 kill_block_super+0x2c/0x5c
[ 43.459702] c6 kill_f2fs_super+0xac/0xd8
[ 43.463684] c6 deactivate_locked_super+0x5c/0x124
[ 43.468442] c6 deactivate_super+0x5c/0x68
[ 43.472512] c6 cleanup_mnt+0x9c/0x118
[ 43.476231] c6 __cleanup_mnt+0x1c/0x28
[ 43.480043] c6 task_work_run+0x88/0xa8
[ 43.483850] c6 do_notify_resume+0x39c/0x1c88
[ 43.488174] c6 work_pending+0x8/0x14
The code at the crash point is in f2fs/node.c:

void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
	...
	while ((found = __gang_lookup_nat_cache(nm_i,
			nid, NATVEC_SIZE, natvec))) {
		unsigned idx;

		nid = nat_get_nid(natvec[found - 1]) + 1;
		for (idx = 0; idx < found; idx++) {
			spin_lock(&nm_i->nat_list_lock);
>			list_del(&natvec[idx]->list);
			spin_unlock(&nm_i->nat_list_lock);
			__del_from_nat_cache(nm_i, natvec[idx]);
		}
	}
The crash happens because the current nat entry in natvec[idx] is an invalid
pointer, or its list member has a NULL next pointer.
We have encountered this issue several times on both Android Q and Android R.
My analysis so far:
1. The current nat entry can be found on the stack; note the value
000000000000000a, matching the faulting address:
ffffff800806b8d0: ffffffc0af33cbc0 ffffffc0af4869a0
> ffffff800806b8e0: ffffffc0f49baa00 000000000000000a
ffffff800806b8f0: ffffffc0af33c040 ffffffc0c69f0e20
ffffff800806b900: ffffffc0c695abc0 ffffffc01e2a4460
2. These invalid entries can also be found in the nat_root radix tree of
f2fs_nm_info.
3. I have reviewed the code around nat_tree_lock and found no clues.
Please let me know if you need any other information.
Thanks a lot.
--
You are receiving this mail because:
You are watching the assignee of the bug.
_______________________________________________
Linux-f2fs-devel mailing list
Linux-f2fs-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/linux-f2fs-devel
* [f2fs-dev] [Bug 210745] kernel crash during umounting a partition with f2fs filesystem
2020-12-17 6:43 [f2fs-dev] [Bug 210745] New: kernel crash during umounting a partition with f2fs filesystem bugzilla-daemon
@ 2020-12-18 10:27 ` bugzilla-daemon
2020-12-21 8:09 ` bugzilla-daemon
` (2 subsequent siblings)
3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2020-12-18 10:27 UTC (permalink / raw)
To: linux-f2fs-devel
https://bugzilla.kernel.org/show_bug.cgi?id=210745
Chao Yu (chao@kernel.org) changed:
What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |NEEDINFO
CC| |chao@kernel.org
--- Comment #1 from Chao Yu (chao@kernel.org) ---
Hi,
I checked the 4.14.193 code and have no clue why this can happen. I don't
remember any such corruption condition occurring on the nid list, since all
updates to it are made under nat_tree_lock; let me know if I missed something.
Have you applied any private patches on top of 4.14.193?
* [f2fs-dev] [Bug 210745] kernel crash during umounting a partition with f2fs filesystem
2020-12-17 6:43 [f2fs-dev] [Bug 210745] New: kernel crash during umounting a partition with f2fs filesystem bugzilla-daemon
2020-12-18 10:27 ` [f2fs-dev] [Bug 210745] " bugzilla-daemon
@ 2020-12-21 8:09 ` bugzilla-daemon
2020-12-21 8:29 ` bugzilla-daemon
2020-12-21 8:44 ` bugzilla-daemon
3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2020-12-21 8:09 UTC (permalink / raw)
To: linux-f2fs-devel
https://bugzilla.kernel.org/show_bug.cgi?id=210745
--- Comment #2 from Zhiguo.Niu (Zhiguo.Niu@unisoc.com) ---
(In reply to Chao Yu from comment #1)
> Hi,
>
> I checked the code of 4.14.193, I don't have any clue about why this can
> happen,
> and I don't remember that there is such corruption condition occured on nid
> list, because all its update is under nat_tree_lock, let me know if I missed
> something.
>
> Do you apply private patch on 4.14.193?
Hi Chao,
Thanks for your reply. I have checked my codebase; there are no other private
patches in the current version.
I find that the local arrays natvec & setvec in f2fs_destroy_node_manager are
initialized with the byte pattern 0xaa (0xaaaaaaaaaaaaaaaa per 64-bit word):
void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
{
	struct f2fs_nm_info *nm_i = NM_I(sbi);
	struct free_nid *i, *next_i;
	struct nat_entry *natvec[NATVEC_SIZE];
	struct nat_entry_set *setvec[SETVEC_SIZE];
	...

Disassembly:
crash_arm64> dis f2fs_destroy_node_manager
0xffffff800842e2a8 <f2fs_destroy_node_manager>:      stp x29, x30, [sp,#-96]!
0xffffff800842e2ac <f2fs_destroy_node_manager+4>:    stp x28, x27, [sp,#16]
0xffffff800842e2b0 <f2fs_destroy_node_manager+8>:    stp x26, x25, [sp,#32]
0xffffff800842e2b4 <f2fs_destroy_node_manager+12>:   stp x24, x23, [sp,#48]
0xffffff800842e2b8 <f2fs_destroy_node_manager+16>:   stp x22, x21, [sp,#64]
0xffffff800842e2bc <f2fs_destroy_node_manager+20>:   stp x20, x19, [sp,#80]
0xffffff800842e2c0 <f2fs_destroy_node_manager+24>:   mov x29, sp
0xffffff800842e2c4 <f2fs_destroy_node_manager+28>:   sub sp, sp, #0x320
0xffffff800842e2c8 <f2fs_destroy_node_manager+32>:   adrp x8, 0xffffff800947e000 <xt_connlimit_locks+768>
0xffffff800842e2cc <f2fs_destroy_node_manager+36>:   ldr x8, [x8,#264]
0xffffff800842e2d0 <f2fs_destroy_node_manager+40>:   mov x27, x0
0xffffff800842e2d4 <f2fs_destroy_node_manager+44>:   str x8, [x29,#-16]
0xffffff800842e2d8 <f2fs_destroy_node_manager+48>:   nop
0xffffff800842e2dc <f2fs_destroy_node_manager+52>:   ldr x20, [x27,#112]
0xffffff800842e2e0 <f2fs_destroy_node_manager+56>:   add x0, sp, #0x110
0xffffff800842e2e4 <f2fs_destroy_node_manager+60>:   mov w1, #0xaa                 // #170
0xffffff800842e2e8 <f2fs_destroy_node_manager+64>:   mov w2, #0x200                // #512
0xffffff800842e2ec <f2fs_destroy_node_manager+68>:   bl 0xffffff8008be6b80 <__memset>
0xffffff800842e2f0 <f2fs_destroy_node_manager+72>:   mov x8, #0xaaaaaaaaaaaaaaaa   // #-6148914691236517206
0xffffff800842e2f4 <f2fs_destroy_node_manager+76>:   stp x8, x8, [sp,#256]
0xffffff800842e2f8 <f2fs_destroy_node_manager+80>:   stp x8, x8, [sp,#240]
0xffffff800842e2fc <f2fs_destroy_node_manager+84>:   stp x8, x8, [sp,#224]
0xffffff800842e300 <f2fs_destroy_node_manager+88>:   stp x8, x8, [sp,#208]
0xffffff800842e304 <f2fs_destroy_node_manager+92>:   stp x8, x8, [sp,#192]
0xffffff800842e308 <f2fs_destroy_node_manager+96>:   stp x8, x8, [sp,#176]
0xffffff800842e30c <f2fs_destroy_node_manager+100>:  stp x8, x8, [sp,#160]
0xffffff800842e310 <f2fs_destroy_node_manager+104>:  stp x8, x8, [sp,#144]
0xffffff800842e314 <f2fs_destroy_node_manager+108>:  stp x8, x8, [sp,#128]
0xffffff800842e318 <f2fs_destroy_node_manager+112>:  stp x8, x8, [sp,#112]
0xffffff800842e31c <f2fs_destroy_node_manager+116>:  stp x8, x8, [sp,#96]
0xffffff800842e320 <f2fs_destroy_node_manager+120>:  stp x8, x8, [sp,#80]
0xffffff800842e324 <f2fs_destroy_node_manager+124>:  stp x8, x8, [sp,#64]
0xffffff800842e328 <f2fs_destroy_node_manager+128>:  stp x8, x8, [sp,#48]
0xffffff800842e32c <f2fs_destroy_node_manager+132>:  stp x8, x8, [sp,#32]
0xffffff800842e330 <f2fs_destroy_node_manager+136>:  stp x8, x8, [sp,#16]
I am not sure this is the root cause, because these invalid entries can also
be found in the nat_root radix tree of f2fs_nm_info.
Thanks!
* [f2fs-dev] [Bug 210745] kernel crash during umounting a partition with f2fs filesystem
2020-12-17 6:43 [f2fs-dev] [Bug 210745] New: kernel crash during umounting a partition with f2fs filesystem bugzilla-daemon
2020-12-18 10:27 ` [f2fs-dev] [Bug 210745] " bugzilla-daemon
2020-12-21 8:09 ` bugzilla-daemon
@ 2020-12-21 8:29 ` bugzilla-daemon
2020-12-21 8:44 ` bugzilla-daemon
3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2020-12-21 8:29 UTC (permalink / raw)
To: linux-f2fs-devel
https://bugzilla.kernel.org/show_bug.cgi?id=210745
--- Comment #3 from Chao Yu (chao@kernel.org) ---
nm_i->nat_list_lock was introduced in 4.19; are you sure your codebase is
4.14.193?
* [f2fs-dev] [Bug 210745] kernel crash during umounting a partition with f2fs filesystem
2020-12-17 6:43 [f2fs-dev] [Bug 210745] New: kernel crash during umounting a partition with f2fs filesystem bugzilla-daemon
` (2 preceding siblings ...)
2020-12-21 8:29 ` bugzilla-daemon
@ 2020-12-21 8:44 ` bugzilla-daemon
3 siblings, 0 replies; 5+ messages in thread
From: bugzilla-daemon @ 2020-12-21 8:44 UTC (permalink / raw)
To: linux-f2fs-devel
https://bugzilla.kernel.org/show_bug.cgi?id=210745
--- Comment #4 from Chao Yu (chao@kernel.org) ---
(In reply to Zhiguo.Niu from comment #2)
> hi Chao,
>
> Thanks for your reply, I have checked my codebase, there is no any other
> private patches in current version.
>
> I find that local variables natvec & setvec in f2fs_destroy_node_manager may
> be inited as 0xaa and 0xaaaaaaaaaaaaaaaa, just like :
>
> void f2fs_destroy_node_manager(struct f2fs_sb_info *sbi)
> {
> struct f2fs_nm_info *nm_i = NM_I(sbi);
> struct free_nid *i, *next_i;
> struct nat_entry *natvec[NATVEC_SIZE];
> struct nat_entry_set *setvec[SETVEC_SIZE];
>
I don't think so; the natvec array is filled by __gang_lookup_nat_cache(), and
natvec[0..found - 1] will be valid, so the "destroy nat cache" loop never
accesses the natvec array out of range.
Can you please check whether @found is valid (@found should be less than or
equal to NATVEC_SIZE)?
BTW, one possible cause could be stack overflow, but would that really happen
during umount()?