From mboxrd@z Thu Jan  1 00:00:00 1970
Date: Thu, 18 Jan 2018 15:05:37 +0100
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Theodore Ts'o <tytso@mit.edu>,
	Alexei Starovoitov <alexei.starovoitov@gmail.com>,
	Daniel Borkmann <daniel@iogearbox.net>, Pavel Machek <pavel@ucw.cz>,
	LKML <linux-kernel@vger.kernel.org>,
	netdev <netdev@vger.kernel.org>, syzkaller-bugs@googlegroups.com,
	Guenter Roeck <groeck@google.com>
Subject: Re: dangers of bots on the mailing lists was Re: divide error in
 ___bpf_prog_run
Message-ID: <20180118140537.GA30059@kroah.com>
References: <001a11405130ff1e9705629eb53c@google.com>
 <20180117093225.GB20303@amd>
 <ea809ab2-51f8-9094-efe2-11acf53458c0@iogearbox.net>
 <CACT4Y+YOzrzdGFswy_zp=XOUSKKNebdOJcMHC=SYASRGj3b7FA@mail.gmail.com>
 <20180117204735.GC6948@thunk.org>
 <20180118002111.b7ejjd2adunmkooj@ast-mbp.dhcp.thefacebook.com>
 <20180118010930.GE6948@thunk.org>
 <CACT4Y+YxbfkwSKagzhcDFff4Yq_xOVgudbkhdJ=kZcE2OYVD3Q@mail.gmail.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <CACT4Y+YxbfkwSKagzhcDFff4Yq_xOVgudbkhdJ=kZcE2OYVD3Q@mail.gmail.com>
User-Agent: Mutt/1.9.2 (2017-12-15)
X-Mailing-List: linux-kernel@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>

On Thu, Jan 18, 2018 at 02:01:28PM +0100, Dmitry Vyukov wrote:
> On Thu, Jan 18, 2018 at 2:09 AM, Theodore Ts'o <tytso@mit.edu> wrote:
> > On Wed, Jan 17, 2018 at 04:21:13PM -0800, Alexei Starovoitov wrote:
> >>
> >> If syzkaller can only test one tree then linux-next should be the one.
> >
> > Well, there's been some controversy about that.  The problem is that
> > it's often not clear if this is a long-standing bug, or a bug which is
> > in a particular subsystem tree --- and if so, *which* subsystem tree,
> > etc.  So it gets blasted to linux-kernel, and to get_maintainer.pl,
> > which is often not accurate --- since the location of the crash
> > doesn't necessarily point out where the problem originated, and hence
> > who should look at the syzbot report.  And so this has caused
> > some.... irritation.
> 
> 
> Re set of tested trees.
> 
> We now have an interesting spectrum of opinions.
> 
> Some assorted thoughts on this:
> 
> 1. First, "upstream is clean" won't happen any time soon. There are
> several reasons for this:
>  - Currently syzkaller only tests a subset of subsystems that it knows
> how to test, and even the ones that it tests, it tests poorly. Over time
> it will be improved to test most subsystems, and existing subsystems better.
> Just a few weeks ago I added some descriptions for the crypto subsystem
> and it uncovered 20+ old bugs.
>  - syzkaller is a guided, genetic fuzzer; over time it learns how to do
> more complex things by small steps. It takes time.
>  - We have more bug detection tools coming: LEAKCHECK, KMSAN (uninit
> memory), KTSAN (data races).
>  - generic syzkaller smartness will be improved over time.
>  - it will get more CPU resources.
> The effect of all of these things is multiplicative: we test more code,
> smarter, with more bug-detection tools, with more resources. So I
> think we need to plan for a mix of old and new bugs for the foreseeable
> future.

That's fine, but when you test Linus's tree, we "know" you are hitting
something that really is an issue, and it's not due to linux-next
oddities.

When I see a linux-next report, and it looks "odd", my default reaction
is "ugh, must be a crazy patch in some other subsystem, I _know_ my code
in linux-next is just fine." :)

> 2. get_maintainer.pl and the mix of old and new bugs were mentioned as
> harming attribution. I don't see what will change when/if we test only
> upstream. Then the same mix of old/new bugs will be detected just on
> upstream, with all of the same problems for old/new, maintainers,
> which subsystem, etc. I think the amount of bugs in the kernel is
> a significant part of the problem, but the exact boundary where we
> decide to start killing them won't affect the number of bugs.

I don't worry about that, the traceback should tell you a lot, and even
when that is wrong (i.e. warnings thrown up by sysfs core calls that are
obviously not a sysfs issue, but rather a subsystem issue), it's easy to
see.

> 3. If we test only upstream, we increase chances of new security bugs
> sinking into releases. We sure could raise perceived security value of
> the bugs by keeping them private, letting them sink into release,
> letting them sink into distros, and then reporting a high-profile
> vulnerability. I think that's wrong. There is something broken with
> value measuring in the security community. A bug that is killed before
> sinking into any release is the highest impact thing. As Alexei noted,
> fixing bugs as early as possible also reduces fix costs, backporting
> burden, etc. This also can eliminate the need for bisection in some cases,
> say if you accepted a large change to some files and a bunch of
> crashes appears for these files on your tree soon, it's obvious what
> happens.

I agree, this is an issue, but I think you have a lot of "low hanging
fruit" in Linus's tree left to find.  Testing linux-next is great, but
the odds of something "new" being added there for your type of testing
right now are usually pretty low, right?

thanks,

greg k-h