From: Marco Elver <elver@google.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Hillf Danton <hdanton@sina.com>,
	Matthew Wilcox <willy@infradead.org>,
	Al Viro <viro@zeniv.linux.org.uk>,
	syzbot <syzbot+919c5a9be8433b8bf201@syzkaller.appspotmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	syzkaller-bugs@googlegroups.com,
	Aleksandr Nogikh <nogikh@google.com>
Subject: Re: [syzbot] WARNING in do_mkdirat
Date: Mon, 12 Dec 2022 20:29:10 +0100
Message-ID: <CANpmjNNCQEXpJt1PQptyr8mrBbhWpToCRfvUT+RXmw5EA5EwVw@mail.gmail.com>
In-Reply-To: <Y5d565XVsinbNNL2@mit.edu>

On Mon, 12 Dec 2022 at 19:58, Theodore Ts'o <tytso@mit.edu> wrote:
>
> On Mon, Dec 12, 2022 at 11:29:11AM +0800, Hillf Danton wrote:
> > > You've completely misunderstood Al's point.  He's not whining about
> > > being cc'd, he's pointing out that this is ONLY USEFUL IF THE NTFS3
> > > MAINTAINERS ARE CC'd.  And they're not.  So this is just noise.
> > > And enough noise means that signal is lost.
> >
> > Call Trace:
> >  <TASK>
> >  inode_unlock include/linux/fs.h:761 [inline]
> >  done_path_create fs/namei.c:3857 [inline]
> >  do_mkdirat+0x2de/0x550 fs/namei.c:4064
> >  __do_sys_mkdirat fs/namei.c:4076 [inline]
> >  __se_sys_mkdirat fs/namei.c:4074 [inline]
> >  __x64_sys_mkdirat+0x85/0x90 fs/namei.c:4074
> >  do_syscall_x64 arch/x86/entry/common.c:50 [inline]
> >  do_syscall_64+0x3d/0xb0 arch/x86/entry/common.c:80
> >  entry_SYSCALL_64_after_hwframe+0x63/0xcd
> >
> > Given the call trace above, how do you know the ntfs3 guys should also
> > be Cc'd in addition to AV? What if it would take more than three months
> > for syzbot to learn the skills in your mind? What is preventing you from
> > routing the report to ntfs3?
>
> If it takes 3 months for syzbot to take a look at the source code in
> their own #!@?! reproducer, or just to take a look at the strace link
> in the dashboard:
>
> [pid  3639] mount("/dev/loop0", "./file2", "ntfs3", MS_NOSUID|MS_NOEXEC|MS_DIRSYNC|MS_I_VERSION, "") = 0
>
> There's something really wrong.  The point Al has been making (and
> I've been making for multiple years) is that Syzbot has the
> information, but unfortunately, at the moment, it is only analyzing
> the stack trace, and it is not doing things that really could be
> done automatically --- and cloud VM time is cheap, and upstream
> maintainer time is expensive.  So by not improving syzbot in a way
> that really shouldn't be all that difficult, the syzbot maintainers are
> disrespecting the time of the upstream maintainers.
>
> So sure, we could ask Linus to triage all syzbot reports --- or we
> could ask Al to triage all syzbot file system reports --- but that is
> not a good use of upstream resources.
>
> And "we didn't know this is super annoying" isn't an excuse, because
> I've been asking for things like this *before* the COVID pandemic.  So
> if the Syzbot team won't listen to observations by a random Google
> engineer who happens to be an ext4 maintainer (or rather, I'm sure
> they were listening, but they didn't consider it important enough to
> staff and put on the roadmap), maybe something a bit
> more.... assertive by Al is something that will inspire them to
> prioritize this feature request "above the fold".  :-)
>
> And Al does have a point --- if a lot of upstream maintainers consider
> Syzbot reports to be less than useful, and either auto-file them to a
> junk folder or just ignore them because they are busy and the
> Probability(Usefulness) is close to zero, then recovering from that
> black eye to Syzbot's reputation is going to be a lot more difficult
> than if Syzbot had been made more respectful of upstream maintainer
> time much earlier.
>
> Now, to be fair to the Syzbot team, the Syzbot console has gotten much
> better.  You can now download the syzbot trace, and download the
> mounted file system, when before, you had to do a lot more work to
> extract the file system (which is stored in separate constant C
> arrays as compressed data) from the C reproducer.  So things have
> gotten better.
>
> But at the same time, characterizing a syzbot report is something to
> be done by every file system maintainer who looks at a syzbot report,
> because there is no way to add a tag to the syzbot report that this
> particular syzbot report *really* is an ntfs3 issue.  So any
> information that a single developer figures out when triaging a bug
> (is this potentially an ext4 bug, nope, it's an ntfs3 bug) has to be
> replicated by every single kernel developer looking at the Syzbot
> dashboard.  Which again, is not respectful of upstream maintainers'
> time.

This is being worked on:
https://github.com/google/syzkaller/issues/3393#issuecomment-1330305227

Teaching a bot the pattern matching skills of a human is non-trivial.
The current design will likely do the simplest thing: regex match
reproducers and map a match to some kernel source dir, for which the
maintainers are Cc'd. If you have better suggestions on how to
mechanize subsystem selection based on a reproducer, please shout.
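
To make the "simplest thing" concrete, here is a minimal sketch of the
idea, not the actual syzkaller design. It assumes the reproducer text
contains a strace-style mount() line like the one quoted above. The
table FS_TYPE_TO_SOURCE_DIR, the regex MOUNT_RE and the function
source_dirs_for_reproducer() are hypothetical names made up for this
illustration.

```python
import re

# Hypothetical table: filesystem type seen in a reproducer -> kernel
# source directory whose maintainers would be Cc'd. Illustration only.
FS_TYPE_TO_SOURCE_DIR = {
    "ntfs3": "fs/ntfs3",
    "ext4": "fs/ext4",
}

# Capture the third argument (the filesystem type) of a mount() call as
# it appears in strace output: mount("src", "target", "fstype", ...).
MOUNT_RE = re.compile(r'\bmount\("[^"]*",\s*"[^"]*",\s*"([^"]+)"')


def source_dirs_for_reproducer(text):
    """Return the source dirs matched by mount() calls, in first-seen order."""
    dirs = []
    for match in MOUNT_RE.finditer(text):
        source_dir = FS_TYPE_TO_SOURCE_DIR.get(match.group(1))
        if source_dir is not None and source_dir not in dirs:
            dirs.append(source_dir)
    return dirs


if __name__ == "__main__":
    strace_line = (
        '[pid  3639] mount("/dev/loop0", "./file2", "ntfs3", '
        'MS_NOSUID|MS_NOEXEC|MS_DIRSYNC|MS_I_VERSION, "") = 0'
    )
    print(source_dirs_for_reproducer(strace_line))  # ['fs/ntfs3']
```

An unknown filesystem type maps to nothing, so in that case the report
would fall back to whatever the stack trace suggests. Looking up the
maintainers for the resulting directory is left out of the sketch.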

