linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
To: Nadav Amit <nadav.amit@gmail.com>, Peter Zijlstra <peterz@infradead.org>
Cc: linux-arch <linux-arch@vger.kernel.org>,
	Arnd Bergmann <arnd@arndb.de>, riel <riel@surriel.com>,
	Will Deacon <will@kernel.org>, x86 <x86@kernel.org>,
	Dave Hansen <dave.hansen@intel.com>,
	linux-kernel <linux-kernel@vger.kernel.org>,
	Nicholas Piggin <npiggin@gmail.com>,
	linux-mm <linux-mm@kvack.org>, Andy Lutomirski <luto@kernel.org>,
	Catalin Marinas <catalin.marinas@arm.com>,
	Jann Horn <jannh@google.com>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>
Subject: Re: [RFC v2 1/2] [NEEDS HELP] x86/mm: Handle unlazying membarrier core sync in the arch code
Date: Fri, 4 Dec 2020 15:39:58 -0500 (EST)	[thread overview]
Message-ID: <77346515.6733.1607114398867.JavaMail.zimbra@efficios.com> (raw)
In-Reply-To: <A61977A7-F0B2-4492-AB6D-06E24417FA59@gmail.com>

----- On Dec 4, 2020, at 3:17 AM, Nadav Amit nadav.amit@gmail.com wrote:

> I am not very familiar with membarrier, but here are my 2 cents while trying
> to answer your questions.
> 
>> On Dec 3, 2020, at 9:26 PM, Andy Lutomirski <luto@kernel.org> wrote:
>> @@ -496,6 +497,8 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct
>> mm_struct *next,
>> 		 * from one thread in a process to another thread in the same
>> 		 * process. No TLB flush required.
>> 		 */
>> +
>> +		// XXX: why is this okay wrt membarrier?
>> 		if (!was_lazy)
>> 			return;
> 
> I am confused.
> 
> On one hand, it seems that membarrier_private_expedited() would issue an IPI
> to that core, as it would find that this core’s cpu_rq(cpu)->curr->mm is the
> same as the one that the membarrier applies to.

If the scheduler switches from one thread to another which both have the same mm,
it means cpu_rq(cpu)->curr->mm is invariant, even though ->curr changes. So there
is no need to issue a memory barrier or sync core for membarrier in this case,
because there is no way the IPI can be missed.

> But… (see below)
> 
> 
>> @@ -508,12 +511,24 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct
>> mm_struct *next,
>> 		smp_mb();
>> 		next_tlb_gen = atomic64_read(&next->context.tlb_gen);
>> 		if (this_cpu_read(cpu_tlbstate.ctxs[prev_asid].tlb_gen) ==
>> -				next_tlb_gen)
>> +		    next_tlb_gen) {
>> +			/*
>> +			 * We're reactivating an mm, and membarrier might
>> +			 * need to serialize.  Tell membarrier.
>> +			 */
>> +
>> +			// XXX: I can't understand the logic in
>> +			// membarrier_mm_sync_core_before_usermode().  What's
>> +			// the mm check for?
>> +			membarrier_mm_sync_core_before_usermode(next);
> 
> On the other hand the reason for this mm check that you mention contradicts
> my previous understanding as the git log says:
> 
> commit 2840cf02fae627860156737e83326df354ee4ec6
> Author: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
> Date:   Thu Sep 19 13:37:01 2019 -0400
> 
>    sched/membarrier: Call sync_core only before usermode for same mm
>    
>    When the prev and next task's mm change, switch_mm() provides the core
>    serializing guarantees before returning to usermode. The only case
>    where an explicit core serialization is needed is when the scheduler
>    keeps the same mm for prev and next.

Hrm, so your point here is that if the scheduler keeps the same mm for
prev and next, it means membarrier will have observed the same rq->curr->mm,
and therefore the IPI won't be missed. I wonder if that
membarrier_mm_sync_core_before_usermode is needed at all then or if we
have just been too careful and did not consider that all the scenarios which
need to be core-sync'd are indeed taken care of ?

I see here that my prior commit message indeed discusses prev and next task's
mm, but in reality, we are comparing current->mm with rq->prev_mm. So from
a lazy TLB perspective, this probably matters, and we may still need a core sync
in some lazy TLB scenarios.

> 
>> 	/*
>> 	 * When switching through a kernel thread, the loop in
>> 	 * membarrier_{private,global}_expedited() may have observed that
>> 	 * kernel thread and not issued an IPI. It is therefore possible to
>> 	 * schedule between user->kernel->user threads without passing though
>> 	 * switch_mm(). Membarrier requires a barrier after storing to
>> -	 * rq->curr, before returning to userspace, so provide them here:
>> +	 * rq->curr, before returning to userspace, and mmdrop() provides
>> +	 * this barrier.
>> 	 *
>> -	 * - a full memory barrier for {PRIVATE,GLOBAL}_EXPEDITED, implicitly
>> -	 *   provided by mmdrop(),
>> -	 * - a sync_core for SYNC_CORE.
>> +	 * XXX: I don't think mmdrop() actually does this.  There's no
>> +	 * smp_mb__before/after_atomic() in there.
> 
> I presume that since x86 is the only one that needs
> membarrier_mm_sync_core_before_usermode(), nobody noticed the missing
> smp_mb__before/after_atomic(). These are anyhow a compiler barrier in x86,
> and such a barrier would take place before the return to userspace.

mmdrop already provides the memory barriers for membarrer, as I documented within the
function.

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com

  reply	other threads:[~2020-12-04 20:42 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-12-04  5:26 [RFC v2 0/2] lazy mm refcounting Andy Lutomirski
2020-12-04  5:26 ` [RFC v2 1/2] [NEEDS HELP] x86/mm: Handle unlazying membarrier core sync in the arch code Andy Lutomirski
2020-12-04  7:06   ` Nicholas Piggin
2020-12-04  8:17   ` Nadav Amit
2020-12-04 20:39     ` Mathieu Desnoyers [this message]
2020-12-04 20:24   ` Mathieu Desnoyers
2020-12-04  5:26 ` [RFC v2 2/2] [MOCKUP] sched/mm: Lightweight lazy mm refcounting Andy Lutomirski
2020-12-04  7:54   ` Nicholas Piggin
2020-12-04 14:37     ` Andy Lutomirski
2020-12-05  4:49       ` Nicholas Piggin

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=77346515.6733.1607114398867.JavaMail.zimbra@efficios.com \
    --to=mathieu.desnoyers@efficios.com \
    --cc=arnd@arndb.de \
    --cc=catalin.marinas@arm.com \
    --cc=dave.hansen@intel.com \
    --cc=jannh@google.com \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=luto@kernel.org \
    --cc=nadav.amit@gmail.com \
    --cc=npiggin@gmail.com \
    --cc=peterz@infradead.org \
    --cc=riel@surriel.com \
    --cc=will@kernel.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).