From: Al Viro <viro@zeniv.linux.org.uk>
To: Marco Elver <elver@google.com>
Cc: Theodore Ts'o <tytso@mit.edu>, Hillf Danton <hdanton@sina.com>,
	Matthew Wilcox <willy@infradead.org>,
	syzbot <syzbot+919c5a9be8433b8bf201@syzkaller.appspotmail.com>,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	syzkaller-bugs@googlegroups.com,
	Aleksandr Nogikh <nogikh@google.com>
Subject: Re: [syzbot] WARNING in do_mkdirat
Date: Tue, 13 Dec 2022 01:44:08 +0000
Message-ID: <Y5fY6BRTB9OfwFU0@ZenIV>
In-Reply-To: <CANpmjNNCQEXpJt1PQptyr8mrBbhWpToCRfvUT+RXmw5EA5EwVw@mail.gmail.com>

On Mon, Dec 12, 2022 at 08:29:10PM +0100, Marco Elver wrote:

> > > Given the call trace above, how do you know the ntfs3 guys should be also
> > > Cced in addition to AV? What if it would take more than three months for
> > > syzbot to learn the skills in your mind?

Depends.  If you really are talking about the *BOT* learning to do
that on its own, it certainly would take more than 3 months; strong AI
is hard.  If, OTOH, it is not an AI research project and intervention by
somebody capable of passing the Turing test does not violate the purity
of the experiment...  Surely converting "if it mounts an image as a filesystem
of type $T, grep the tree for "MODULE_ALIAS_FS($T)" and treat that
as if a function from the resulting file had been found in the stack trace"
into something usable for the bot should not take more than 3 months,
should it?

If expressing that rule really takes "more than three months", I would
suggest that something is very wrong with the bot architecture...

> Teaching a bot the pattern matching skills of a human is non-trivial.
> The current design will likely do the simplest thing: regex match
> reproducers and map a match to some kernel source dir, for which the
> maintainers are Cc'd. If you have better suggestions on how to
> mechanize subsystem selection based on a reproducer, please shout.

Er...  Yes?  Look, it's really that simple -
for i in $(sed -ne 's/.*syz_mount_image\$\([_[:alnum:]]*\).*/\1/p' <"$REPRO"); do
	git grep -lF "MODULE_ALIAS_FS(\"$i\")"
done | sort -u
gets you the list of files.  No, I'm not suggesting to go for that kind
of shell use, but it's clearly doable with regex and search over the source
for fixed strings.  Unless something's drastically wrong with the way the
bot is written, it should be capable of something as basic as that...
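
For illustration, the same rule can be sketched outside of shell.  This is
only a sketch under assumptions, not how syzbot does or should do it: the
function names (fs_types, files_for, files_for_repro) are made up for the
example, it walks *.c files on disk rather than asking git for tracked
files, and it collects every syz_mount_image$T on a line where the sed
expression above keeps only the last one.

```python
import re
from pathlib import Path

# Same idea as the sed expression: the word after "syz_mount_image$".
# "+" instead of "*" so that an empty type name is skipped (an assumption).
MOUNT_RE = re.compile(r"syz_mount_image\$([_0-9A-Za-z]+)")


def fs_types(repro_text):
    """Return the set of filesystem types the reproducer mounts."""
    return set(MOUNT_RE.findall(repro_text))


def files_for(fs_type, tree_root):
    """Return the *.c files under tree_root containing MODULE_ALIAS_FS("<fs_type>")."""
    root = Path(tree_root)
    needle = 'MODULE_ALIAS_FS("%s")' % fs_type  # fixed string, no regex needed
    hits = []
    for path in root.rglob("*.c"):
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file, ignore it
        if needle in text:
            hits.append(path.relative_to(root).as_posix())
    return sorted(hits)


def files_for_repro(repro_text, tree_root):
    """Return the sorted, deduplicated file list for all mounts in the reproducer."""
    found = set()
    for fs_type in fs_types(repro_text):
        found.update(files_for(fs_type, tree_root))
    return sorted(found)
```

The resulting paths would then be fed to whatever the bot already uses to
turn a file from the stack trace into a list of people to Cc.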

If it can't do that kind of mapping, precalculating it for a given tree is
also not hard:
git grep 'MODULE_ALIAS_FS("'|sed -ne 's/\(.*\):.*MODULE_ALIAS_FS("\([_[:alnum:]]*\)".*/syz_mount_image$\2:\1/p'
will yield lines like
syz_mount_image$ext2:fs/ext2/super.c
syz_mount_image$ext2:fs/ext4/super.c
syz_mount_image$ext3:fs/ext4/super.c
syz_mount_image$ext4:fs/ext4/super.c
etc.  Surely turning *that* into whatever form the bot wants can't
be terribly hard? [*]
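
As a rough sketch of that precalculation in a form a program can consume
directly: the function name build_table and the dict-of-lists layout are
assumptions made up for this example, and it scans *.c files on disk
instead of going through git grep.

```python
import re
from pathlib import Path

# Captures T from MODULE_ALIAS_FS("T"), same character class as the sed version.
ALIAS_RE = re.compile(r'MODULE_ALIAS_FS\("([_0-9A-Za-z]*)"')


def build_table(tree_root):
    """Map "syz_mount_image$T" to the list of files declaring MODULE_ALIAS_FS("T")."""
    root = Path(tree_root)
    table = {}
    for path in sorted(root.rglob("*.c")):
        try:
            text = path.read_text(errors="replace")
        except OSError:
            continue  # unreadable file, ignore it
        rel = path.relative_to(root).as_posix()
        for fs_type in ALIAS_RE.findall(text):
            files = table.setdefault("syz_mount_image$" + fs_type, [])
            if rel not in files:
                files.append(rel)
    return table
```

A key with more than one file, as with ext2 above, simply keeps all of
them; the consumer can pick any or take the union.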

All of that assumes that pattern-matching on the syzkaller reproducer is
expressible; if "we must do everything by call trace alone" is
a real limitation, we are SOL; the stack trace simply doesn't have
that information.  Is there such an architectural limitation?

[*] depending upon config, ext2 could be mounted by either ext2.ko or ext4.ko;
both have the same mailing list for bug reports, so this ambiguity doesn't
matter - either match would do.
