From: Martin Schwidefsky <schwidefsky@de.ibm.com>
To: Catalin Marinas <catalin.marinas@arm.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
	Ingo Molnar <mingo@redhat.com>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 1/2] sched/mm: add finish_switch_mm function
Date: Wed, 13 Nov 2013 17:05:56 +0100
Message-ID: <20131113170556.7e170e89@mschwide>
In-Reply-To: <20131113121909.GA18837@arm.com>

On Wed, 13 Nov 2013 12:19:09 +0000
Catalin Marinas <catalin.marinas@arm.com> wrote:

> On Wed, Nov 13, 2013 at 11:41:43AM +0000, Peter Zijlstra wrote:
> > On Wed, Nov 13, 2013 at 09:16:13AM +0100, Martin Schwidefsky wrote:
> > > The switch_mm function is called with the task_lock and/or with
> > > request queue lock. Add finish_switch_mm to allow an architecture
> > > to execute some code after the mm has been switched but without
> > > any locks held. One use case is the s390 architecture which will
> > > use this to wait for the completion of TLB flush operations.
> 
> We have similar needs on arm and arm64 (full cache flushing where we
> want interrupts enable or some IPIs for TLB tagging synchronisation).

On s390 we need to wait for the completion of a TLB flush.

> > > diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> > > index 1deccd7..89409cb 100644
> > > --- a/kernel/sched/core.c
> > > +++ b/kernel/sched/core.c
> > > @@ -32,7 +32,7 @@
> > >  #include <linux/init.h>
> > >  #include <linux/uaccess.h>
> > >  #include <linux/highmem.h>
> > > -#include <asm/mmu_context.h>
> > > +#include <linux/mmu_context.h>
> > >  #include <linux/interrupt.h>
> > >  #include <linux/capability.h>
> > >  #include <linux/completion.h>
> > > @@ -1996,6 +1996,7 @@ static void finish_task_switch(struct rq *rq, struct task_struct *prev)
> > >  	perf_event_task_sched_in(prev, current);
> > >  	finish_lock_switch(rq, prev);
> > >  	finish_arch_post_lock_switch();
> > > +	finish_switch_mm(current->mm, current);
> 
> This could use the same hook.

Yes.
 
> > >  
> > >  	fire_sched_in_preempt_notifiers(current);
> > >  	if (mm)
> > > @@ -4140,8 +4141,10 @@ void idle_task_exit(void)
> > >  
> > >  	BUG_ON(cpu_online(smp_processor_id()));
> > >  
> > > -	if (mm != &init_mm)
> > > +	if (mm != &init_mm) {
> > >  		switch_mm(mm, &init_mm, current);
> > > +		finish_switch_mm(&init_mm, current);
> > > +	}
> > >  	mmdrop(mm);
> > >  }
> 
> Here finish_switch_mm() is called in the same context with switch_mm().
> What we have on ARM via switch_mm() is to check for irqs_disabled() and
> if yes, defer the actual switching via a flag until the
> finish_arch_post_lock_switch() hook. But on ARM we only cared about the
> interrupts being enabled.

The guarantee s390 needs is that the rq-lock is not held. What I have
seen with the wait loop in switch_mm is a deadlock: CPU #0 was looping
in switch_mm, waiting for the TLB flush of CPU #1, while CPU #1 got an
interrupt that tried to wake up a task which happened to be on the
run-queue of CPU #0. The wake-up needs the rq-lock of CPU #0, which
CPU #0 holds while it loops, so neither CPU makes progress.
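
To make the ordering concrete, here is a minimal user-space sketch, not
kernel code. All names in it (model_rq, model_mm, model_switch_mm,
model_finish_switch_mm) are hypothetical and only model the idea: while
the lock is held, just record that a wait is needed; do the actual
waiting only after the lock has been dropped. The decrement in the loop
merely stands in for the other CPU completing its flush.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_mm {
	int flush_pending;	/* outstanding TLB flushes still to complete */
	bool wait_deferred;	/* set by the switch, consumed by the finish */
};

struct model_rq {
	bool locked;		/* stands in for the run-queue lock */
};

/* Runs with the lock held: must not spin, only notes that a wait is due. */
static void model_switch_mm(struct model_rq *rq, struct model_mm *next)
{
	assert(rq->locked);
	if (next->flush_pending > 0)
		next->wait_deferred = true;
}

/*
 * Runs after the lock has been dropped, so waiting is safe here.
 * Returns the number of polls it took until no flush was pending.
 */
static int model_finish_switch_mm(struct model_rq *rq, struct model_mm *mm)
{
	int polls = 0;

	assert(!rq->locked);
	if (!mm->wait_deferred)
		return 0;
	while (mm->flush_pending > 0) {
		mm->flush_pending--;	/* model: the other CPU makes progress */
		polls++;
	}
	mm->wait_deferred = false;
	return polls;
}
```

With the wait moved out of the locked region, the other CPU in this
model can always take the lock for its wake-up, finish, and complete
its flush.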

> > > diff --git a/mm/mmu_context.c b/mm/mmu_context.c
> > > index 8a8cd02..11b3d47 100644
> > > --- a/mm/mmu_context.c
> > > +++ b/mm/mmu_context.c
> > > @@ -8,8 +8,6 @@
> > >  #include <linux/export.h>
> > >  #include <linux/sched.h>
> > >  
> > > -#include <asm/mmu_context.h>
> > > -
> > >  /*
> > >   * use_mm
> > >   *	Makes the calling kernel thread take on the specified
> > > @@ -31,6 +29,7 @@ void use_mm(struct mm_struct *mm)
> > >  	tsk->mm = mm;
> > >  	switch_mm(active_mm, mm, tsk);
> > >  	task_unlock(tsk);
> > > +	finish_switch_mm(mm, tsk);
> 
> As above, for ARM we only care about interrupts being enabled, so it
> didn't require a hook.
> 
> Is s390 switch_mm() ok with only interrupts being enabled but some locks
> held?

Interrupts being on or off is not the problem for s390; what matters
is that the rq-lock is not held.

-- 
blue skies,
   Martin.

"Reality continues to ruin my life." - Calvin.


