All of lore.kernel.org
 help / color / mirror / Atom feed
From: Qian Cai <cai@lca.pw>
To: Marco Elver <elver@google.com>, John Hubbard <jhubbard@nvidia.com>
Cc: Jan Kara <jack@suse.cz>, David Hildenbrand <david@redhat.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	ira.weiny@intel.com, Dan Williams <dan.j.williams@intel.com>,
	Linux Memory Management List <linux-mm@kvack.org>,
	Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
	"Paul E. McKenney" <paulmck@kernel.org>,
	kasan-dev <kasan-dev@googlegroups.com>
Subject: Re: [PATCH] mm: fix a data race in put_page()
Date: Mon, 10 Feb 2020 11:23:09 -0500	[thread overview]
Message-ID: <1581351789.7365.32.camel@lca.pw> (raw)
In-Reply-To: <CANpmjNNaHAnKCMLb+Njs3AhEoJT9O6-Yh63fcNcVTjBbNQiEPg@mail.gmail.com>

On Mon, 2020-02-10 at 08:48 +0100, Marco Elver wrote:
> On Sun, 9 Feb 2020 at 08:15, John Hubbard <jhubbard@nvidia.com> wrote:
> > 
> > On 2/8/20 7:10 PM, Qian Cai wrote:
> > > 
> > > 
> > > > On Feb 8, 2020, at 8:44 PM, John Hubbard <jhubbard@nvidia.com> wrote:
> > > > 
> > > > So it looks like we're probably stuck with having to annotate the code. Given
> > > > that, there is a balance between how many macros, and how much commenting. For
> > > > example, if there is a single macro (data_race, for example), then we'll need to
> > > > add comments for the various cases, explaining which data_race situation is
> > > > happening.
> > > 
> > > On the other hand, it is perfect fine of not commenting on each data_race() that most of times, people could run git blame to learn more details. Actually, no maintainers from various of subsystems asked for commenting so far.
> > > 
> > 
> > Well, maybe I'm looking at this wrong. I was thinking that one should attempt to
> > understand the code on the screen, and that's generally best--but here, maybe
> > "data_race" is just something that means "tool cruft", really. So mentally we
> > would move toward visually filtering out the data_race "key word".
> 
> One thing to note is that 'data_race()' points out concurrency, and
> that somebody has deemed that the code won't break even with data
> races. Somebody trying to understand or modify the code should ensure
> this will still be the case. So, 'data_race()' isn't just tool cruft.
> It's documentation for something that really isn't obvious from the
> code alone.
> 
> Whenever we see a READ_ONCE or other marked access it is obvious to
> the reader that there are concurrent accesses happening.  I'd argue
> that for intentional data races, we should convey similar information,
> to avoid breaking the code (of course KCSAN would tell you, but only
> after the change was done). Even moreso, since changes to code
> involving 'data_race()' will need re-verification that the data races
> are still safe.
> 
> > I really don't like it but at least there is a significant benefit from the tool
> > that probably makes it worth the visual noise.
> > 
> > Blue sky thoughts for The Far Future: It would be nice if the tools got a lot
> > better--maybe in the direction of C language extensions, even if only used in
> > this project at first.
> 
> Still thinking about this.  What we want to convey is that, while
> there are races on the particular variable, nobody should be modifying
> the bits here. Adding a READ_ONCE (or data_race()) would miss a
> harmful race where somebody modifies these bits, so in principle I
> agree. However, I think the tool can't automatically tell (even if we
> had compiler extensions to give us the bits accessed) which bits we
> care about, because we might have something like:
> 
> int foo_bar = READ_ONCE(flags) >> FOO_BAR_SHIFT;  // need the
> READ_ONCE because of FOO bits
> .. (foo_bar & FOO_MASK) ..  // FOO bits can be modified concurrently
> .. (foo_bar & BAR_MASK) ..  // nobody should modify BAR bits
> concurrently though !
> 
> What we want is to assert that nobody touches a particular set of
> bits. KCSAN has recently gotten ASSERT_EXCLUSIVE_{WRITER,ACCESS}
> macros which help assert properties of concurrent code, where bugs
> won't manifest as data races. Along those lines, I can see the value
> in doing an exclusivity check on a bitmask of a variable.
> 
> I don't know how much a READ_BITS macro could help, since it's
> probably less ergonomic to have to say something like:
>   READ_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT) >> ZONES_PGSHIFT.
> 
> Here is an alternative:
> 
> Let's say KCSAN gives you this:
>    /* ... Assert that the bits set in mask are not written
> concurrently; they may still be read concurrently.
>      The access that immediately follows is assumed to access those
> bits and safe w.r.t. data races.
> 
>      For example, this may be used when certain bits of @flags may
> only be modified when holding the appropriate lock,
>      but other bits may still be modified locklessly.
>    ...
>   */
>    #define ASSERT_EXCLUSIVE_BITS(flags, mask)   ....
> 
> Then we can write page_zonenum as follows:
> 
> static inline enum zone_type page_zonenum(const struct page *page)
>  {
> +       ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
>         return (page->flags >> ZONES_PGSHIFT) & ZONES_MASK;
>  }

Actually, it seems still need to write if I understand correctly,

ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);
return data_race((page->flags >> ZONES_PGSHIFT) & ZONES_MASK);

On the other hand, if you really worry about this thing could go wrong, it might
be better of using READ_ONCE() at the first place where it will be more future-
proof with the trade-off it might generate less efficient code optimization?

Alternatively, is there a way to write this as this?

return ASSERT_EXCLUSIVE_BITS(page->flags, ZONES_MASK << ZONES_PGSHIFT);

Kind of ugly but it probably cleaner.

> 
> This will accomplish the following:
> 1. The current code is not touched, and we do not have to verify that
> the change is correct without KCSAN.
> 2. We're not introducing a bunch of special macros to read bits in various ways.
> 3. KCSAN will assume that the access is safe, and no data race report
> is generated.
> 4. If somebody modifies ZONES bits concurrently, KCSAN will tell you
> about the race.
> 5. We're documenting the code.
> 
> Anything I missed?
> 
> Thanks,
> -- Marco
> 
> 
> 
> 
> 
> > thanks,
> > --
> > John Hubbard
> > NVIDIA
> > 
> > > > 
> > > > That's still true, but to a lesser extent if more macros are added. In this case,
> > > > I suspect that READ_BITS() makes the commenting easier and shorter. So I'd tentatively
> > > > lead towards adding it, but what do others on the list think?
> > > 
> > > Even read bits could be dangerous from data races and confusing at best, so I am not really sure what the value of introducing this new macro. People who like to understand it correctly still need to read the commit logs.
> > > 
> > > This flags->zonenum is such a special case that I don’t really see it regularly for the last few weeks digging KCSAN reports, so even if it is worth adding READ_BITS(), there are more equally important macros need to be added together to be useful initially. For example, HARMLESS_COUNTERS(), READ_SINGLE_BIT(), READ_IMMUTATABLE_BITS() etc which Linus said exactly wanted to avoid.
> > > 

  parent reply	other threads:[~2020-02-10 16:23 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-06 13:17 [PATCH] mm: fix a data race in put_page() Qian Cai
2020-02-06 13:33 ` David Hildenbrand
2020-02-06 13:51   ` Qian Cai
2020-02-06 13:51     ` Qian Cai
2020-02-06 14:55   ` Jan Kara
2020-02-06 14:59     ` David Hildenbrand
2020-02-06 15:23     ` Qian Cai
2020-02-06 23:34       ` John Hubbard
2020-02-06 23:36         ` John Hubbard
2020-02-06 23:55         ` Qian Cai
2020-02-07  0:18         ` Qian Cai
2020-02-07  0:27           ` John Hubbard
2020-02-07  0:55             ` Qian Cai
2020-02-07 13:17               ` Marco Elver
2020-02-07 13:17                 ` Marco Elver
2020-02-09  1:44                 ` John Hubbard
2020-02-09  3:10                   ` Qian Cai
2020-02-09  7:12                     ` John Hubbard
2020-02-10  7:48                       ` Marco Elver
2020-02-10  7:48                         ` Marco Elver
2020-02-10 12:16                         ` Qian Cai
2020-02-10 12:58                           ` Marco Elver
2020-02-10 12:58                             ` Marco Elver
2020-02-10 13:36                             ` Qian Cai
2020-02-10 13:38                               ` Marco Elver
2020-02-10 13:38                                 ` Marco Elver
2020-02-10 13:55                                 ` Qian Cai
2020-02-10 14:12                                   ` Marco Elver
2020-02-10 14:12                                     ` Marco Elver
2020-02-10 14:31                                     ` Qian Cai
2020-02-10 16:23                         ` Qian Cai [this message]
2020-02-10 16:23                           ` Qian Cai
2020-02-10 16:33                           ` Marco Elver
2020-02-10 16:33                             ` Marco Elver

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1581351789.7365.32.camel@lca.pw \
    --to=cai@lca.pw \
    --cc=akpm@linux-foundation.org \
    --cc=dan.j.williams@intel.com \
    --cc=david@redhat.com \
    --cc=elver@google.com \
    --cc=ira.weiny@intel.com \
    --cc=jack@suse.cz \
    --cc=jhubbard@nvidia.com \
    --cc=kasan-dev@googlegroups.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=paulmck@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.