linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Linus Torvalds <torvalds@linux-foundation.org>
To: kernel test robot <oliver.sang@intel.com>
Cc: Steven Rostedt <rostedt@goodmis.org>,
	oe-lkp@lists.linux.dev, lkp@intel.com,
	 linux-kernel@vger.kernel.org,
	Masami Hiramatsu <mhiramat@kernel.org>,
	 Mark Rutland <mark.rutland@arm.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	 Christian Brauner <brauner@kernel.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	 Ajay Kaher <ajay.kaher@broadcom.com>,
	linux-trace-kernel@vger.kernel.org
Subject: Re: [linus:master] [eventfs] 852e46e239: BUG:unable_to_handle_page_fault_for_address
Date: Sun, 28 Jan 2024 20:36:12 -0800	[thread overview]
Message-ID: <CAHk-=wh0M=e8R=ZXxa4vesLTtvGmYWJ-w1VmXxW5Mva=Nimk4Q@mail.gmail.com> (raw)
In-Reply-To: <202401291043.e62e89dc-oliver.sang@intel.com>

On Sun, 28 Jan 2024 at 18:59, kernel test robot <oliver.sang@intel.com> wrote:
>
>   21:   48 8b 47 f8             mov    -0x8(%rdi),%rax
>   25:   48 85 c0                test   %rax,%rax
>   28:   74 11                   je     0x3b
>   2a:*  f6 40 78 02             testb  $0x2,0x78(%rax)          <-- trapping instruction

So this is

        struct tracefs_inode *ti = get_tracefs(inode);
        struct eventfs_inode *ei = ti->private;

        if (!ei || !ei->is_events || ..

in set_top_events_ownership(), and it's that 'ei->is_events' test that oopses.

The 'inode' is the incoming argument (in %rdi), and you don't see code
generation for the "get_tracefs(inode)" because it's just an offset
off the inode.

So the "ti->private" read is that read off "-8(%rdi)", because

  struct tracefs_inode {
        unsigned long           flags;
        void                    *private;
        struct inode            vfs_inode;
  };

so 'private' is 8 bytes below the 'struct inode' pointer.

So 'ei' isn't NULL, but it's a bad pointer, and 'ei->is_events' is
that "testb  $0x2,0x78(%rax)" and it oopses as a result.

I don't think this is directly related to that commit 852e46e239ee
("eventfs: Do not create dentries nor inodes in iterate_shared") that
the kernel test robot talks about.

It looks like some inode creation never filled in the ->private field
at all, and it's just garbage.

I note that we have code like this:

 create_dir_dentry():
        ...
        mutex_unlock(&eventfs_mutex);

        dentry = create_dir(ei, parent);

        mutex_lock(&eventfs_mutex);
        ...
        if (!ei->dentry && !ei->is_freed) {
                ei->dentry = dentry;
                eventfs_post_create_dir(ei);
                dentry->d_fsdata = ei;
        } else {

and that eventfs_post_create_dir() seems to be where it fills in that
->private pointer:

        ti = get_tracefs(ei->dentry->d_inode);
        ti->private = ei;

but notice how create_dir() has done that

        d_instantiate(dentry, inode);

which exposes the inode to lookup - long before it's actually then filled in.

IOW, what I think is going on is a very basic race, where
create_dir_dentry() will allocate the inode and attach it to the
dentry long before the inode has been fully initialized.

So if somebody comes in *immediately* after that, and does a 'stat()'
on that name that now is exposed, it will see an inode that has not
yet made it to that eventfs_post_create_dir() phase, and that in turn
explains why

        struct eventfs_inode *ei = ti->private;

just reads random garbage values.

So if I'm right (big "if" - it looks likely, but who knows) this bug
is entirely unrelated to any dentry caching or any reference counting.

It just needs just the right timing, where one CPU happens to do a
'stat()' just as another one creates the directory.

It's not a big window, so you do need some timing "luck".

End result: the d_instantiate() needs to be done *after* the inode has
been fully filled in.

Alternatively, you could

 (a) not drop the eventfs_mutex around the create_dir() call

 (b) take the eventfs_mutex around all of set_top_events_ownership()

and just fix it by having the lock protect the lack of ordering.

                 Linus

  reply	other threads:[~2024-01-29  4:36 UTC|newest]

Thread overview: 47+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-29  2:58 [linus:master] [eventfs] 852e46e239: BUG:unable_to_handle_page_fault_for_address kernel test robot
2024-01-29  4:36 ` Linus Torvalds [this message]
2024-01-29 17:01   ` Steven Rostedt
2024-01-29 17:40     ` Linus Torvalds
2024-01-29 17:44       ` Steven Rostedt
2024-01-29 17:45         ` Steven Rostedt
2024-01-29 17:56         ` Linus Torvalds
2024-01-29 17:55       ` Linus Torvalds
2024-01-29 19:24       ` Linus Torvalds
2024-01-29 19:51         ` Linus Torvalds
2024-01-29 20:26           ` Steven Rostedt
2024-01-29 20:51             ` Linus Torvalds
2024-01-29 21:45               ` Steven Rostedt
2024-01-29 22:19                 ` Linus Torvalds
2024-01-29 21:55               ` Steven Rostedt
2024-01-29 22:22               ` Steven Rostedt
2024-01-29 22:35                 ` Linus Torvalds
2024-01-29 22:42                   ` Linus Torvalds
2024-01-29 22:49                     ` Steven Rostedt
2024-01-30  0:01                       ` Linus Torvalds
2024-01-30  0:35                         ` Steven Rostedt
2024-01-30  1:50                           ` Linus Torvalds
2024-01-30  3:56                             ` Linus Torvalds
2024-01-30  8:43                               ` Linus Torvalds
2024-01-30  9:12                                 ` Linus Torvalds
2024-01-30 12:45                                   ` Rasmus Villemoes
2024-01-30 14:39                                   ` Steven Rostedt
2024-01-30 16:49                                     ` Steven Rostedt
2024-01-30 16:55                                       ` Linus Torvalds
2024-01-30 17:06                                         ` Steven Rostedt
2024-01-30 17:09                                         ` Linus Torvalds
2024-01-30 16:51                                     ` Linus Torvalds
2024-01-30 18:23                               ` Steven Rostedt
2024-01-30 19:19                                 ` Linus Torvalds
2024-01-30 19:37                                   ` Steven Rostedt
2024-01-30 19:54                                     ` Linus Torvalds
2024-01-30 20:04                                       ` Steven Rostedt
2024-01-31 15:58                                       ` Steven Rostedt
2024-01-31 16:13                                         ` Steven Rostedt
2024-01-31 17:28                                           ` Steven Rostedt
2024-01-31 17:26                                         ` Steven Rostedt
2024-01-31 19:35                                         ` Linus Torvalds
2024-01-31 19:42                                           ` Steven Rostedt
2024-01-30 18:36                               ` Steven Rostedt
2024-01-29 22:47                   ` Steven Rostedt
2024-01-29 23:15                     ` Steven Rostedt
2024-01-30  2:08   ` Al Viro

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='CAHk-=wh0M=e8R=ZXxa4vesLTtvGmYWJ-w1VmXxW5Mva=Nimk4Q@mail.gmail.com' \
    --to=torvalds@linux-foundation.org \
    --cc=ajay.kaher@broadcom.com \
    --cc=brauner@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-trace-kernel@vger.kernel.org \
    --cc=lkp@intel.com \
    --cc=mark.rutland@arm.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=mhiramat@kernel.org \
    --cc=oe-lkp@lists.linux.dev \
    --cc=oliver.sang@intel.com \
    --cc=rostedt@goodmis.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).