From: Nadav Amit <namit@vmware.com>
To: Minchan Kim <minchan@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>,
Ingo Molnar <mingo@kernel.org>,
Stephen Rothwell <sfr@canb.auug.org.au>,
Andrew Morton <akpm@linux-foundation.org>,
Thomas Gleixner <tglx@linutronix.de>,
"Ingo Molnar" <mingo@elte.hu>, "H. Peter Anvin" <hpa@zytor.com>,
"Linux-Next Mailing List" <linux-next@vger.kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>,
Linus <torvalds@linux-foundation.org>
Subject: Re: linux-next: manual merge of the akpm-current tree with the tip tree
Date: Mon, 14 Aug 2017 05:07:19 +0000
Message-ID: <0F858068-D41D-46E3-B4A8-8A95B4EDB94F@vmware.com>
In-Reply-To: <20170814031613.GD25427@bbox>
Minchan Kim <minchan@kernel.org> wrote:
> On Sun, Aug 13, 2017 at 02:50:19PM +0200, Peter Zijlstra wrote:
>> On Sun, Aug 13, 2017 at 06:06:32AM +0000, Nadav Amit wrote:
>>>> however mm_tlb_flush_nested() is a mystery, it appears to care about
>>>> anything inside the range. For now rely on it doing at least _a_ PTL
>>>> lock instead of taking _the_ PTL lock.
>>>
>>> It does not care about “anything” inside the range, but only about situations
>>> in which there is at least one (same) PT that was modified by one core and
>>> then read by the other. So, yes, it will always be _the_ same PTL, and not
>>> _a_ PTL - in the cases that flush is really needed.
>>>
>>> The issue that might require additional barriers is that
>>> inc_tlb_flush_pending() and mm_tlb_flush_nested() are called when the PTL is
>>> not held. IIUC, since the release-acquire might not behave as a full memory
>>> barrier, this requires an explicit memory barrier.
>>
>> So I'm not entirely clear about this yet.
>>
>> How about:
>>
>>
>> CPU0 CPU1
>>
>> tlb_gather_mmu()
>>
>> lock PTLn
>> no mod
>> unlock PTLn
>>
>> tlb_gather_mmu()
>>
>> lock PTLm
>> mod
>> include in tlb range
>> unlock PTLm
>>
>> lock PTLn
>> mod
>> unlock PTLn
>>
>> tlb_finish_mmu()
>> force = mm_tlb_flush_nested(tlb->mm);
>> arch_tlb_finish_mmu(force);
>>
>>
>> ... more ...
>>
>> tlb_finish_mmu()
>>
>>
>>
>> In this case you also want CPU1's mm_tlb_flush_nested() call to return
>> true, right?
>
> No, because CPU 1 modified the pte and added it to the tlb range,
> so regardless of nesting it will flush the TLB, and there is no stale
> TLB problem.
>
>> But even with an smp_mb__after_atomic() at CPU0's tlb_gather_mmu()
>> you're not guaranteed CPU1 sees the increment. The only way to do that
>> is to make the PTL locks RCsc and that is a much more expensive
>> proposition.
>>
>>
>> What about:
>>
>>
>> CPU0 CPU1
>>
>> tlb_gather_mmu()
>>
>> lock PTLn
>> no mod
>> unlock PTLn
>>
>>
>> lock PTLm
>> mod
>> include in tlb range
>> unlock PTLm
>>
>> tlb_gather_mmu()
>>
>> lock PTLn
>> mod
>> unlock PTLn
>>
>> tlb_finish_mmu()
>> force = mm_tlb_flush_nested(tlb->mm);
>> arch_tlb_finish_mmu(force);
>>
>>
>> ... more ...
>>
>> tlb_finish_mmu()
>>
>> Do we want CPU1 to see it here? If so, where does it end?
>
> Ditto. Since CPU 1 has added range, it will flush TLB regardless
> of nested condition.
>
>> CPU0 CPU1
>>
>> tlb_gather_mmu()
>>
>> lock PTLn
>> no mod
>> unlock PTLn
>>
>>
>> lock PTLm
>> mod
>> include in tlb range
>> unlock PTLm
>>
>> tlb_finish_mmu()
>> force = mm_tlb_flush_nested(tlb->mm);
>>
>> tlb_gather_mmu()
>>
>> lock PTLn
>> mod
>> unlock PTLn
>>
>> arch_tlb_finish_mmu(force);
>>
>>
>> ... more ...
>>
>> tlb_finish_mmu()
>>
>>
>> This?
>>
>>
>> Could you clarify under what exact condition mm_tlb_flush_nested() must
>> return true?
>
> mm_tlb_flush_nested() is aimed at the CPU that did not update the pte
> but still needs a TLB flush.
> As I wrote in https://marc.info/?l=linux-mm&m=150267398226529&w=2 ,
> there is a stale TLB problem if we don't flush the TLB even though
> there is no pte modification.

To clarify: the main problem that these patches address is when the first
CPU updates the PTE, and the second CPU sees the updated value and thinks: “the
PTE is already what I wanted - no flush is needed”.
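
The scenario above can be sketched as a small single-threaded model. This is only an illustration: struct mm_model and all function names are hypothetical, not kernel code, and it assumes the pending indication is a counter where a value above one means another batch is still in flight.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical single-threaded model; these are not kernel data structures. */
struct mm_model {
    bool pte_writable;   /* PTE as stored in the page table */
    bool tlb_writable;   /* copy of the translation still cached in a TLB */
    int  flush_pending;  /* stands in for mm->tlb_flush_pending */
};

/* First CPU: open a batch and write-protect the PTE; its flush is deferred. */
static void first_cpu_update(struct mm_model *mm)
{
    mm->flush_pending++;
    mm->pte_writable = false;
}

/* Second CPU: wants the page write-protected. Returns true if it flushed. */
static bool second_cpu_protect(struct mm_model *mm, bool check_nested)
{
    bool need_flush = false;

    mm->flush_pending++;                 /* its own batch */
    if (mm->pte_writable) {              /* false here: PTE already updated */
        mm->pte_writable = false;
        need_flush = true;
    }
    /* Analogue of mm_tlb_flush_nested(): more than our own batch pending. */
    if (check_nested && mm->flush_pending > 1)
        need_flush = true;
    if (need_flush)
        mm->tlb_writable = mm->pte_writable;   /* flush: TLB reloads the PTE */

    assert(mm->flush_pending > 0);
    mm->flush_pending--;
    return need_flush;
}

/* True if a writable TLB entry survives the second CPU's operation. */
static bool stale_tlb_survives(bool check_nested)
{
    struct mm_model mm = { .pte_writable = true, .tlb_writable = true,
                           .flush_pending = 0 };

    first_cpu_update(&mm);
    second_cpu_protect(&mm, check_nested);
    return mm.tlb_writable;
}
```

In this model, without the nested check the second CPU skips the flush and the writable TLB entry survives; with the check it flushes even though it modified nothing.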

For some reason (I would assume intentional), all the examples here first
“do not modify” the PTE, and then modify it - which is not an “interesting”
case. However, based on my understanding of the memory barriers, I think
there is indeed a missing barrier before the read in
mm_tlb_flush_nested(). IIUC, using smp_mb__after_unlock_lock() in this case,
before the read, would solve the problem with the least impact on systems with
strong memory ordering.
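
To show where such a barrier would sit, here is a rough userspace sketch using C11 atomics. The model_* names are made up for illustration, and atomic_thread_fence(memory_order_seq_cst) is only a stand-in for a full barrier: it is stronger than the kernel primitive would need to be on strongly ordered systems, so this shows placement, not cost.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Models mm->tlb_flush_pending; zero-initialized static storage. */
static atomic_int tlb_flush_pending;

/* Analogue of inc_tlb_flush_pending(): called without the PTL held. */
static void model_inc_tlb_flush_pending(void)
{
    atomic_fetch_add_explicit(&tlb_flush_pending, 1, memory_order_relaxed);
    /* Order the increment before the PTE updates that follow. */
    atomic_thread_fence(memory_order_seq_cst);
}

/* Analogue of mm_tlb_flush_nested() with the barrier before the read. */
static bool model_mm_tlb_flush_nested(void)
{
    /* Stand-in for the barrier suggested above, placed before the read. */
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&tlb_flush_pending,
                                memory_order_relaxed) > 1;
}

/* Analogue of dropping the pending indication once the flush is done. */
static void model_dec_tlb_flush_pending(void)
{
    int old;

    /* Order the flush before the decrement becomes visible. */
    atomic_thread_fence(memory_order_seq_cst);
    old = atomic_fetch_sub_explicit(&tlb_flush_pending, 1,
                                    memory_order_relaxed);
    assert(old > 0);   /* the counter must never underflow */
    (void)old;
}
```

A single batch sees a count of one and does not report nesting; a second concurrent batch raises the count above one, which is what the nested check looks for.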

Minchan, as for the solution you proposed, it seems to reopen a race,
since the “pending” indication is removed before the actual TLB flush is
performed.
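
The ordering concern can be illustrated with another small single-threaded model. It is not the proposed patch; the names are hypothetical, and it only compares the two possible orders of "clear pending" and "flush".

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model; not a kernel structure. */
struct mm_model {
    int  flush_pending;  /* stands in for mm->tlb_flush_pending */
    bool tlb_stale;      /* a TLB still holds the old translation */
};

/* What a second CPU would wrongly conclude if it looked right now. */
static bool observer_misses_flush(const struct mm_model *mm)
{
    return mm->tlb_stale && mm->flush_pending == 0;
}

/* True if there is a moment with a stale TLB and no pending indication. */
static bool race_window_exists(bool clear_before_flush)
{
    struct mm_model mm = { .flush_pending = 1, .tlb_stale = true };
    bool window = false;

    if (clear_before_flush) {
        mm.flush_pending--;                      /* indication removed... */
        window |= observer_misses_flush(&mm);    /* ...TLB still stale */
        mm.tlb_stale = false;                    /* flush happens too late */
    } else {
        mm.tlb_stale = false;                    /* flush first */
        window |= observer_misses_flush(&mm);
        assert(mm.flush_pending > 0);
        mm.flush_pending--;                      /* then drop the indication */
    }
    window |= observer_misses_flush(&mm);
    return window;
}
```

Only the clear-before-flush order leaves a point at which another CPU can observe no pending flush while the stale entry is still present.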
Nadav