From: Marco Elver <elver@google.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: Qian Cai <cai@lca.pw>, Andrew Morton <akpm@linux-foundation.org>,
Johannes Weiner <hannes@cmpxchg.org>,
Linux Memory Management List <linux-mm@kvack.org>,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH -next] mm/page_counter: mark intentional data races
Date: Wed, 29 Jan 2020 10:06:40 +0100
Message-ID: <CANpmjNNaCtL+vqpPKug9_DoFUue=PdoTyQFXLOx5H_BYCyDMzA@mail.gmail.com>
In-Reply-To: <20200129085124.GF24244@dhcp22.suse.cz>

On Wed, 29 Jan 2020 at 09:51, Michal Hocko <mhocko@kernel.org> wrote:
>
> On Tue 28-01-20 23:20:19, Qian Cai wrote:
> > The commit 3e32cb2e0a12 ("mm: memcontrol: lockless page counters")
> > had memcg->memsw->failcnt and ->watermark could be accessed concurrently
> > as reported by KCSAN,
> >
> > Reported by Kernel Concurrency Sanitizer on:
> > BUG: KCSAN: data-race in page_counter_try_charge / page_counter_try_charge
> >
> > read to 0xffff8fb18c4cd190 of 8 bytes by task 1081 on cpu 59:
> > page_counter_try_charge+0x4d/0x150 mm/page_counter.c:138
> > try_charge+0x131/0xd50

Why are the line numbers missing for the remaining symbols? Doesn't
scripts/decode_stacktrace.sh give you all of them?

[ As an aside: if you want to use what syzbot uses to put line numbers
on symbols (it is a bit faster), see:
https://github.com/google/syzkaller/tree/master/tools/syz-symbolize
https://github.com/google/syzkaller/blob/master/docs/linux/setup.md
then 'go build tools/syz-symbolize'. ]

> > __memcg_kmem_charge_memcg+0x58/0x140
> > __memcg_kmem_charge+0xcc/0x280
> > __alloc_pages_nodemask+0x1e1/0x450
> > alloc_pages_current+0xa6/0x120
> > pte_alloc_one+0x17/0xd0
> > __pte_alloc+0x3a/0x1f0
> > copy_p4d_range+0xc36/0x1990
> > copy_page_range+0x21d/0x360
> > dup_mmap+0x5f5/0x7a0
> > dup_mm+0xa2/0x240
> > copy_process+0x1b3f/0x3460
> > _do_fork+0xaa/0xa20
> > __x64_sys_clone+0x13b/0x170
> > do_syscall_64+0x91/0xb47
> > entry_SYSCALL_64_after_hwframe+0x49/0xbe
> >
> > write to 0xffff8fb18c4cd190 of 8 bytes by task 1153 on cpu 120:
> > page_counter_try_charge+0x5b/0x150 mm/page_counter.c:139
> > try_charge+0x131/0xd50
> > mem_cgroup_try_charge+0x159/0x460
> > mem_cgroup_try_charge_delay+0x3d/0xa0
> > wp_page_copy+0x14d/0x930
> > do_wp_page+0x107/0x7b0
> > __handle_mm_fault+0xce6/0xd40
> > handle_mm_fault+0xfc/0x2f0
> > do_page_fault+0x263/0x6f9
> > page_fault+0x34/0x40
> >
> > Since the failcnt and watermark are tolerant of some inaccuracy, a data
> > race will not be harmful, thus mark them as intentional data races with
> > the data_race() macro.
>
> I am not familiar with KCSAN and git grep for data_race on the current
> linux-next doesn't really show any users of this macro. Is there a
> general consensus that data_race is going to be used to silence all
> KCSAN false positives?

It was discussed here:
https://lore.kernel.org/linux-fsdevel/CAHk-=wg5CkOEF8DTez1Qu0XTEFw_oHhxN98bDnFqbY7HL5AB2g@mail.gmail.com/

If these are intentional data races that should remain, data_race() is
one option. Besides actually fixing the data race, there are 4 options
for dealing with 'false positives':
https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git/tree/Documentation/dev-tools/kcsan.rst#n101

That being said, every use of data_race() needs to be justified, not
just applied without understanding the issue. See below.

> > Signed-off-by: Qian Cai <cai@lca.pw>
> > ---
> > mm/page_counter.c | 10 +++++-----
> > 1 file changed, 5 insertions(+), 5 deletions(-)
> >
> > diff --git a/mm/page_counter.c b/mm/page_counter.c
> > index de31470655f6..13934636eafd 100644
> > --- a/mm/page_counter.c
> > +++ b/mm/page_counter.c
> > @@ -82,8 +82,8 @@ void page_counter_charge(struct page_counter *counter, unsigned long nr_pages)
> > * This is indeed racy, but we can live with some
> > * inaccuracy in the watermark.
> > */
> > - if (new > c->watermark)
> > - c->watermark = new;
> > + if (data_race(new > c->watermark))
> > + data_race(c->watermark = new);

These should use READ_ONCE() and WRITE_ONCE() for c->watermark.
Load or store tearing would change the logic here, since the
comparison might see a torn (garbage) value.
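
A minimal userspace sketch of the marked watermark update, for illustration only. The READ_ONCE()/WRITE_ONCE() macros below are simplified volatile-cast stand-ins and not the kernel's definitions; struct demo_counter, demo_update_watermark() and new_usage are hypothetical names that do not appear in mm/page_counter.c.

```c
/*
 * Simplified userspace stand-ins (assumption: GNU C __typeof__ is
 * available, as with gcc or clang). The kernel's real macros are
 * more involved.
 */
#define READ_ONCE(x)       (*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val) (*(volatile __typeof__(x) *)&(x) = (val))

/* Hypothetical cut-down counter holding only the watermark. */
struct demo_counter {
	unsigned long watermark;
};

/*
 * Raise the watermark if new_usage exceeds it. The volatile accesses
 * ask the compiler to perform the load and the store as single
 * accesses, so the comparison is not made against a torn value.
 * The check-then-store sequence as a whole is still not atomic.
 */
static void demo_update_watermark(struct demo_counter *c,
				  unsigned long new_usage)
{
	if (new_usage > READ_ONCE(c->watermark))
		WRITE_ONCE(c->watermark, new_usage);
}
```

Two concurrent updaters can still overwrite a larger watermark with a smaller one, which is the inaccuracy the code comment already accepts. The marking only rules out torn or otherwise compiler-mangled accesses to c->watermark.
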
> > }
> > }
> >
> > @@ -126,7 +126,7 @@ bool page_counter_try_charge(struct page_counter *counter,
> > * This is racy, but we can live with some
> > * inaccuracy in the failcnt.
> > */
> > - c->failcnt++;
> > + data_race(c->failcnt++);
This is probably fine.
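
As a rough userspace illustration of what the marking expresses: the data_race() macro below is a placeholder that merely evaluates its argument (the kernel's macro additionally tells KCSAN not to report the access), and struct fail_counter and note_failed_charge() are hypothetical names.

```c
/*
 * Placeholder only: evaluates the expression unchanged. This is not
 * the kernel's definition of data_race().
 */
#define data_race(expr) (expr)

/* Hypothetical cut-down counter holding only the failure count. */
struct fail_counter {
	unsigned long failcnt;
};

/*
 * Count a failed charge. The increment is a plain read-modify-write,
 * so concurrent callers may lose updates; the marking documents that
 * this inaccuracy is accepted for a statistics-only value.
 */
static void note_failed_charge(struct fail_counter *c)
{
	data_race(c->failcnt++);
}
```

Unlike the watermark case, nothing here branches on the value that was read, so a lost or stale increment only makes the statistic slightly inaccurate.
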
> > *fail = c;
> > goto failed;
> > }
> > @@ -135,8 +135,8 @@ bool page_counter_try_charge(struct page_counter *counter,
> > * Just like with failcnt, we can live with some
> > * inaccuracy in the watermark.
> > */
> > - if (new > c->watermark)
> > - c->watermark = new;
> > + if (data_race(new > c->watermark))
> > + data_race(c->watermark = new);
This should also use READ_ONCE() / WRITE_ONCE().
> > }
> > return true;
> >
> > --
> > 2.21.0 (Apple Git-122.2)
>
> --
> Michal Hocko
> SUSE Labs