Re: [PATCH v3 08/25] kcsan: Show location access was reordered to

From: Boqun Feng <boqun.feng@gmail.com>
To: Marco Elver <elver@google.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>,
	Alexander Potapenko <glider@google.com>,
	Borislav Petkov <bp@alien8.de>,
	Dmitry Vyukov <dvyukov@google.com>,
	Ingo Molnar <mingo@kernel.org>,
	Mark Rutland <mark.rutland@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Waiman Long <longman@redhat.com>, Will Deacon <will@kernel.org>,
	kasan-dev@googlegroups.com, linux-arch@vger.kernel.org,
	linux-doc@vger.kernel.org, linux-kbuild@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	llvm@lists.linux.dev, x86@kernel.org
Date: Mon, 6 Dec 2021 22:31:24 +0800
Message-ID: <Ya4evHE7uQ9eXpax@boqun-archlinux>
In-Reply-To: <CANpmjNMirKGSBW2m+bWRM9_FnjK3_HjnJC=dhyMktx50mwh1GQ@mail.gmail.com>

On Mon, Dec 06, 2021 at 08:16:11AM +0100, Marco Elver wrote:
> On Mon, 6 Dec 2021 at 06:04, Boqun Feng <boqun.feng@gmail.com> wrote:
> >
> > Hi,
> >
> > On Tue, Nov 30, 2021 at 12:44:16PM +0100, Marco Elver wrote:
> > > Also show the location the access was reordered to. An example report:
> > >
> > > | ==================================================================
> > > | BUG: KCSAN: data-race in test_kernel_wrong_memorder / test_kernel_wrong_memorder
> > > |
> > > | read-write to 0xffffffffc01e61a8 of 8 bytes by task 2311 on cpu 5:
> > > |  test_kernel_wrong_memorder+0x57/0x90
> > > |  access_thread+0x99/0xe0
> > > |  kthread+0x2ba/0x2f0
> > > |  ret_from_fork+0x22/0x30
> > > |
> > > | read-write (reordered) to 0xffffffffc01e61a8 of 8 bytes by task 2310 on cpu 7:
> > > |  test_kernel_wrong_memorder+0x57/0x90
> > > |  access_thread+0x99/0xe0
> > > |  kthread+0x2ba/0x2f0
> > > |  ret_from_fork+0x22/0x30
> > > |   |
> > > |   +-> reordered to: test_kernel_wrong_memorder+0x80/0x90
> > > |
> >
> > Should this be "reordered from" instead of "reordered to"? For example,
> > if the following case needs a smp_mb() between write to A and write to
> > B, I think currently it will report as follows:
> >
> >         foo() {
> >                 WRITE_ONCE(A, 1); // let's say A's address is 0xaaaa
> >                 bar() {
> >                         WRITE_ONCE(B, 1); // Assume B's address is 0xbbbb
> >                                           // KCSAN finds the problem here
> >                 }
> >         }
> >
> >         <report>
> >         | write (reordered) to 0xaaaa of ...:
> >         | bar+0x... // address of the write to B
> >         | foo+0x... // address of the callsite to bar()
> >         | ...
> >         |  |
> >         |  +-> reordered to: foo+0x... // address of the write to A
> >
> > But since the access reported here is the write to A, shouldn't it be
> > "reordered from" instead of "reordered to"?
> 
> Perhaps I could have commented on this in the commit message to avoid
> the confusion, but per its updated comment replace_stack_entry()
> "skips to the first entry that matches the function of @ip, and then
> replaces that entry with @ip, returning the entries to skip with
> @replaced containing the replaced entry."
> 
> When a reorder_access is set up, the ip to it is stored, which is
> what's passed to @ip of replace_stack_entry(). It effectively swaps
> the top frame where the race occurred with where the original access
> happened. This all works because the runtime is careful to only keep
> reorder_accesses valid until the original function where it occurred
> is left.
> 

Thanks for the explanation, I was missing the swap here. However...

> So in your above example you need to swap "reordered to" and the top
> frame of the stack trace.
> 

IIUC, the report for my above example will be:

         | write (reordered) to 0xaaaa of ...:
         | foo+0x... // address of the write to A
         | ...
         |  |
         |  +-> reordered to: foo+0x... // address of the callsite to bar() in foo()

, right? Because in replace_stack_entry(), it's not the top frame where
the race occurred that gets swapped, it's the frame which belongs to the
same function as the original access that gets swapped. In other words,
when KCSAN finds the problem, top entries of the calling stack are:

	[0] bar+0x.. // address of the write to B
	[1] foo+0x.. // address of the callsite to bar() in foo()

After replace_stack_entry(), they change to:

	[0] bar+0x.. // address of the write to B
skip  ->[1] foo+0x.. // address of the write to A

As a result, the report won't mention bar() at all.

I think a better report would be:

         | write (reordered) to 0xaaaa of ...:
         | foo+0x... // address of the write to A
         | ...
         |  |
         |  +-> reordered to: bar+0x... // address of the write to B in bar()

because it tells users the exact place the access got reordered to. That
means maybe we want something like the diff below? Not completely tested,
but I played with scope checking a bit, and it seems to give what I want.
Thoughts?

Regards,
Boqun

diff --git a/kernel/kcsan/report.c b/kernel/kcsan/report.c
index 67794404042a..b495ed3aa637 100644
--- a/kernel/kcsan/report.c
+++ b/kernel/kcsan/report.c
@@ -324,7 +324,10 @@ replace_stack_entry(unsigned long stack_entries[], int num_entries, unsigned lon
        else
                goto fallback;

-       for (skip = 0; skip < num_entries; ++skip) {
+       skip = get_stack_skipnr(stack_entries, num_entries);
+       *replaced = stack_entries[skip];
+
+       for (;skip < num_entries; ++skip) {
                unsigned long func = stack_entries[skip];

                if (!kallsyms_lookup_size_offset(func, &symbolsize, &offset))
@@ -332,7 +335,6 @@ replace_stack_entry(unsigned long stack_entries[], int num_entries, unsigned lon
                func -= offset;

                if (func == target_func) {
-                       *replaced = stack_entries[skip];
                        stack_entries[skip] = ip;
                        return skip;
                }

> The implementation is a little trickier of course, but I really wanted
> the main stack trace to look like any other non-reordered access,
> which starts from the original access, and only have the "reordered
> to" location be secondary information.
> 
> The foundation for doing this was put in place here:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=6c65eb75686f
> 
> Thanks,
> -- Marco