From: Minchan Kim <minchan@kernel.org>
To: Tejun Heo <tj@kernel.org>
Cc: Jirka Hladky <jhladky@redhat.com>,
	Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	regressions@lists.linux.dev,
	Thorsten Leemhuis <regressions@leemhuis.info>,
	Justin Forbes <jforbes@fedoraproject.org>
Subject: Re: PANIC: "Oops: 0000 [#1] PREEMPT SMP PTI" starting from 5.17 on dual socket Intel Xeon Gold servers
Date: Fri, 22 Apr 2022 11:27:42 -0700
Message-ID: <YmLznjFdpblHzZiM@google.com>
In-Reply-To: <YmGKrd1BR9HSEy6q@slm.duckdns.org>

On Thu, Apr 21, 2022 at 06:47:41AM -1000, Tejun Heo wrote:
> Sorry about the late reply.
> 
> On Wed, Apr 20, 2022 at 10:02:20AM +0200, Jirka Hladky wrote:
> > > Based on your report, the kernel crashed because kn_mondata was NULL:
> > >
> > >   rdt_kill_sb
> > >     rmdir_all_sub
> > >       ..
> > >       kernfs_remove(kn_mondata);
> > >         struct kernfs_root *root = kernfs_root(kn); <-- crashed
> > >
> > >
> > > Before my patch[1], it worked like this:
> > >
> > >   rdt_kill_sb
> > >     rmdir_all_sub
> > >       ..
> > >       kernfs_remove(kn_mondata);
> > >         down_write(&kernfs_rwsem);
> > >           if (!kn)
> > >             return;
> > >         up_write(&kernfs_rwsem);
> > >
> > > IOW, kernfs_remove() used to handle a NULL argument by simply bailing
> > > out, but with my patch[1] it no longer does.
> > >
> > > That raises a question for the kernfs maintainers:
> > >
> > > Should the kernfs_remove() API support a NULL parameter? If so, can we
> > > support it atomically without the old global kernfs_rwsem?
> > >
> > > [1] 393c3714081a, kernfs: switch global kernfs_rwsem lock to per-fs lock
> 
> Yes, I mean, kernfs_remove() used to support NULL arg, so it should do the
> same after the locking change too. Can you send a patch?

Thanks for checking, Tejun.

Jirka, could you test the patch below? Once it's confirmed, I'll resend
it with stable Cc'ed.

Thanks.

From c7441bc659d2869f2d751b43f27356156e028513 Mon Sep 17 00:00:00 2001
From: Minchan Kim <minchan@kernel.org>
Date: Fri, 22 Apr 2022 11:16:45 -0700
Subject: [PATCH] kernfs: fix NULL dereferencing in kernfs_remove

kernfs_remove() used to accept a NULL kernfs_node and simply bail out,
but the recent per-fs lock change introduced a regression: it
dereferences the parameter without a NULL check, so the kernel crashes.

Check for a NULL kernfs_node in kernfs_remove() and return early if so.

Quote from the bug report by Jirka:

```
The bug is triggered by running NAS Parallel benchmark suite on
SuperMicro servers with 2x Xeon(R) Gold 6126 CPU. Here is the error
log:

[  247.035564] BUG: kernel NULL pointer dereference, address: 0000000000000008
[  247.036009] #PF: supervisor read access in kernel mode
[  247.036009] #PF: error_code(0x0000) - not-present page
[  247.036009] PGD 0 P4D 0
[  247.036009] Oops: 0000 [#1] PREEMPT SMP PTI
[  247.058060] CPU: 1 PID: 6546 Comm: umount Not tainted 5.16.0393c3714081a53795bbff0e985d24146def6f57f+ #16
[  247.058060] Hardware name: Supermicro Super Server/X11DDW-L, BIOS 2.0b 03/07/2018
[  247.058060] RIP: 0010:kernfs_remove+0x8/0x50
[  247.058060] Code: 4c 89 e0 5b 5d 41 5c 41 5d 41 5e c3 49 c7 c4 f4 ff ff ff eb b2 66 66 2e 0f 1f 84 00 00 00 00 00 66 90 0f 1f 44 00 00 41 54 55 <48> 8b 47 08 48 89 fd 48 85 c0 48 0f 44 c7 4c 8b 60 50 49 83 c4 60
[  247.058060] RSP: 0018:ffffbbfa48a27e48 EFLAGS: 00010246
[  247.058060] RAX: 0000000000000001 RBX: ffffffff89e31f98 RCX: 0000000080200018
[  247.058060] RDX: 0000000080200019 RSI: fffff6760786c900 RDI: 0000000000000000
[  247.058060] RBP: ffffffff89e31f98 R08: ffff926b61b24d00 R09: 0000000080200018
[  247.122048] R10: ffff926b61b24d00 R11: ffff926a8040c000 R12: ffff927bd09a2000
[  247.122048] R13: ffffffff89e31fa0 R14: dead000000000122 R15: dead000000000100
[  247.122048] FS:  00007f01be0a8c40(0000) GS:ffff926fa8e40000(0000) knlGS:0000000000000000
[  247.122048] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  247.122048] CR2: 0000000000000008 CR3: 00000001145c6003 CR4: 00000000007706e0
[  247.122048] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  247.122048] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  247.122048] PKRU: 55555554
[  247.122048] Call Trace:
[  247.122048]  <TASK>
[  247.122048]  rdt_kill_sb+0x29d/0x350
[  247.122048]  deactivate_locked_super+0x36/0xa0
[  247.122048]  cleanup_mnt+0x131/0x190
[  247.122048]  task_work_run+0x5c/0x90
[  247.122048]  exit_to_user_mode_prepare+0x229/0x230
[  247.122048]  syscall_exit_to_user_mode+0x18/0x40
[  247.122048]  do_syscall_64+0x48/0x90
[  247.122048]  entry_SYSCALL_64_after_hwframe+0x44/0xae
[  247.122048] RIP: 0033:0x7f01be2d735b
```

Fixes: 393c3714081a ("kernfs: switch global kernfs_rwsem lock to per-fs lock")
Reported-by: Jirka Hladky <jhladky@redhat.com>
Signed-off-by: Minchan Kim <minchan@kernel.org>
---
 fs/kernfs/dir.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 61a8edc4ba8b..e205fde7163a 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1406,7 +1406,12 @@ static void __kernfs_remove(struct kernfs_node *kn)
  */
 void kernfs_remove(struct kernfs_node *kn)
 {
-	struct kernfs_root *root = kernfs_root(kn);
+	struct kernfs_root *root;
+
+	if (!kn)
+		return;
+
+	root = kernfs_root(kn);
 
 	down_write(&root->kernfs_rwsem);
 	__kernfs_remove(kn);
-- 
2.36.0.rc2.479.g8af0fa9b8e-goog

