From: Jan Kara <jack@suse.cz>
To: Dmitry Vyukov <dvyukov@google.com>
Cc: Jan Kara <jack@suse.cz>,
	Tetsuo Handa <penguin-kernel@i-love.sakura.ne.jp>,
	Al Viro <viro@zeniv.linux.org.uk>,
	linux-fsdevel <linux-fsdevel@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	"Paul E. McKenney" <paulmck@linux.ibm.com>,
	Alan Stern <stern@rowland.harvard.edu>,
	Andrea Parri <andrea.parri@amarulasolutions.com>
Subject: Re: [PATCH] fs: ratelimit __find_get_block_slow() failure message.
Date: Wed, 16 Jan 2019 15:51:30 +0100
Message-ID: <20190116145130.GH26069@quack2.suse.cz>
In-Reply-To: <CACT4Y+ZtrBeZyNeSJ_9d3DdVuP21=h7TNnOZJ_wLhLu11+qAAA@mail.gmail.com>

On Wed 16-01-19 13:37:22, Dmitry Vyukov wrote:
> On Wed, Jan 16, 2019 at 12:56 PM Jan Kara <jack@suse.cz> wrote:
> >
> > On Wed 16-01-19 12:03:27, Dmitry Vyukov wrote:
> > > On Wed, Jan 16, 2019 at 11:43 AM Jan Kara <jack@suse.cz> wrote:
> > > >
> > > > On Wed 16-01-19 10:47:56, Dmitry Vyukov wrote:
> > > > > On Fri, Jan 11, 2019 at 1:46 PM Tetsuo Handa
> > > > > <penguin-kernel@i-love.sakura.ne.jp> wrote:
> > > > > >
> > > > > > On 2019/01/11 19:48, Dmitry Vyukov wrote:
> > > > > > > >> How did you arrive at the conclusion that it is harmless?
> > > > > > >> There is only one relevant standard covering this, which is the C
> > > > > > >> language standard, and it is very clear on this -- this has Undefined
> > > > > > >> Behavior, that is the same as, for example, reading/writing random
> > > > > > >> pointers.
> > > > > > >>
> > > > > > >> Check out this on how any race that you might think is benign can be
> > > > > > >> badly miscompiled and lead to arbitrary program behavior:
> > > > > > >> https://software.intel.com/en-us/blogs/2013/01/06/benign-data-races-what-could-possibly-go-wrong
> > > > > > >
> > > > > > > Also there is no other practical definition of data race for automatic
> > > > > > > data race detectors than: two conflicting non-atomic concurrent
> > > > > > > accesses. Which this code is. Which means that if we continue writing
> > > > > > > such code we are not getting data race detection and don't detect
> > > > > > > thousands of races in kernel code that one may consider more harmful
> > > > > > > > than this one the easy way. And instead we will spend large amounts of
> > > > > > > > time fixing some of them the hard way, and leave the rest as just too
> > > > > > > hard to debug so let the kernel continue crashing from time to time (I
> > > > > > > believe a portion of currently open syzbot bugs that developers just
> > > > > > > left as "I don't see how this can happen" are due to such races).
> > > > > > >
> > > > > >
> > > > > > I still don't follow. Read/write of sizeof(long) bytes at naturally
> > > > > > aligned address is atomic, isn't it?
> > > > >
> > > > > Nobody guarantees this. According to C non-atomic conflicting
> > > > > reads/writes of sizeof(long) cause undefined behavior of the whole
> > > > > program.
> > > >
> > > > Yes, but to be fair the kernel has always relied on long accesses being
> > > > atomic pretty heavily so that it is now a de facto standard for the kernel
> > > > AFAICT. I understand this makes life for static checkers hard but such is
> > > > reality.
> > >
> > > Yes, but nobody has ever defined what "a long access" means. And if you
> > > see a function that accepts a long argument and stores it into a long
> > > field, no, it does not qualify. I bet this will come as a surprise to
> > > lots of developers.
> >
> > Yes, inlining and other optimizations can screw you.
> >
> > > Check out this fix and try to extrapolate how this "function stores
> > > long into a long leads to a serious security bug" can actually be
> > > applied to a whole lot of places after inlining (or when somebody just
> > > slightly shuffles code in a way that looks totally safe) that also
> > > kinda look safe and atomic:
> > > https://lore.kernel.org/patchwork/patch/599779/
> > > So where is the boundary between "a long access" that is atomic and
> > > the one that is not necessarily atomic?
> >
> > So I tend to rely on "long access being atomic" for opaque values (no
> > flags, no counters, ...). Just a value that gets fetched from some global
> > variable / other data structure, stored, read, and possibly compared for
> > equality. I agree the compiler could still screw you if it could infer how
> > that value got initially created and try to be clever about it...
> 
> So can you, or somebody else, define a set of rules that we can use to
> discriminate each particular case? How can we avoid that "the compiler
> could still screw you"?
> 
> Inlining is always enabled, right, so one needs to take into account
> everything that can possibly be inlined, now or in the future. And also
> link-time code generation; if we don't use it we are dropping 10% of
> performance on the floor.
> Also, ensuring that the code works when it's first submitted is the
> smaller part of the problem. Ensuring that it continues to work in the
> future is what's more important. Try to imagine how much of a burden
> this puts onto all developers who touch any kernel code in the future.
> Basically if you slightly alter local logic in a function that does
> not do any loads/stores, you can screw multiple "proofs" that long
> accesses are atomic. Or, you just move a function from a .c file to a .h.
> I can bet nobody re-proves all the "long accesses are atomic" claims around
> the changed code during code reviews, so these things break over time.
> Or, even if only comparisons are involved (which you mentioned as
> "safe") I see how that can actually affect the compilation process. Say
> we are in the branch where 2 variables compare equal. Since no
> concurrency is involved from the compiler's point of view, it can, say,
> discard one variable and then re-load it from the other variable's
> location, and now that other variable may hold a value that the first
> one must never have. I don't have a full scenario, but that's exactly the
> point. One will never see all possibilities.
> 
> It all becomes a super slippery slope very quickly. And we do want the
> compiler to generate code that is as fast as possible and do all these
> optimizations. And it's not that there are big objective reasons to
> not just mark all concurrent accesses and stop spending large amounts
> of time on these "proofs".

I guess you've convinced me that marking such accesses in some way is
desirable. So is what you suggest to use atomic_long_t, with
atomic_long_set() / atomic_long_read() for manipulation, instead?
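
To make the question concrete, here is a minimal userspace sketch of the
kind of marking I mean. It assumes C11 <stdatomic.h> as a stand-in for the
kernel's atomic_long_t; the struct and helper names are made up for
illustration and this is not the actual fs/buffer.c code:

```c
#include <stdatomic.h>

/* Hypothetical structure holding an opaque long shared between threads. */
struct shared_state {
	atomic_long cookie;	/* userspace stand-in for atomic_long_t */
};

/*
 * Marked store, analogous to atomic_long_set(): the compiler must emit
 * a single untorn store. Relaxed ordering, so no barrier is implied.
 */
static void state_set_cookie(struct shared_state *s, long v)
{
	atomic_store_explicit(&s->cookie, v, memory_order_relaxed);
}

/*
 * Marked load, analogous to atomic_long_read(): the value is read once
 * and the compiler may not silently re-load it from memory later.
 */
static long state_get_cookie(struct shared_state *s)
{
	return atomic_load_explicit(&s->cookie, memory_order_relaxed);
}
```

A writer would call state_set_cookie() and concurrent readers
state_get_cookie(); since every access is marked, a data race detector has
nothing to flag here and no "proof" about inlining is needed.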

								Honza
-- 
Jan Kara <jack@suse.com>
SUSE Labs, CR
