linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* can't flush tlb on e500
@ 2009-05-20 10:12 Saito Hideo
  2009-05-22  0:57 ` Hideo Saito
  2009-05-22  9:27 ` Benjamin Herrenschmidt
  0 siblings, 2 replies; 5+ messages in thread
From: Saito Hideo @ 2009-05-20 10:12 UTC (permalink / raw)
  To: linux-kernel

It seems that the code for powerpc with e500 on linux-2.6.29 has an
regression, because I can't execute hackbench on my platform. I
apologize if already reported.

For example, following message was asserted, however the command is
done successfully if the number given to hackbench is small.

% ./hackbench 20
SERVER: read (error: Bad file descriptor)

% ./hackbench 4
%

I think that the tlb should be cleared before mm->context.id is set
MMU_NO_CONTEXT.

--- arch/powerpc/mm/mmu_context_nohash.c.orig	2009-03-24
08:12:14.000000000 +0900
+++ arch/powerpc/mm/mmu_context_nohash.c	2009-05-20 18:33:53.000000000 +0900
@@ -122,22 +122,22 @@ static unsigned int steal_context_up(uns
 	struct mm_struct *mm;
 	int cpu = smp_processor_id();

 	/* Pick up the victim mm */
 	mm = context_mm[id];

 	pr_debug("[%d] steal context %d from mm @%p\n", cpu, id, mm);

-	/* Mark this mm has having no context anymore */
-	mm->context.id = MMU_NO_CONTEXT;
-
 	/* Flush the TLB for that context */
 	local_flush_tlb_mm(mm);

+	/* Mark this mm has having no context anymore */
+	mm->context.id = MMU_NO_CONTEXT;
+
 	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
 	__clear_bit(id, stale_map[cpu]);

 	return id;
 }

 #ifdef DEBUG_MAP_CONSISTENCY
 static void context_check_map(void)

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: can't flush tlb on e500
  2009-05-20 10:12 can't flush tlb on e500 Saito Hideo
@ 2009-05-22  0:57 ` Hideo Saito
  2009-05-22  9:20   ` Benjamin Herrenschmidt
  2009-05-22  9:27 ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 5+ messages in thread
From: Hideo Saito @ 2009-05-22  0:57 UTC (permalink / raw)
  To: linux-kernel; +Cc: benh, hsaito.ppc

How about following changes because all TLB entries are flushed repeatedly if processes overflow at the table mapping the context. I think that the table should be initialized again because _tlbil_pid() flushes all TLB entries and tlbilx instruction isn't supported on E500(MPC8548).

--- arch/powerpc/mm/mmu_context_nohash.c.orig	2009-03-24 08:12:14.000000000 +0900
+++ arch/powerpc/mm/mmu_context_nohash.c	2009-05-21 16:35:09.000000000 +0900
@@ -107,39 +107,69 @@ static unsigned int steal_context_smp(un
 	 */
 	spin_unlock(&context_lock);
 	cpu_relax();
 	spin_lock(&context_lock);
 	goto again;
 }
 #endif  /* CONFIG_SMP */
 
+/*
+ * We're flushed using the all context
+ */
+static void flush_all_context(int cpu)
+{
+	struct mm_struct *mm;
+	int n;
+
+	for (n = first_context; n <= last_context; n++) {
+
+		mm = context_mm[n];
+		if (mm == NULL || mm->context.id == MMU_NO_CONTEXT)
+			continue;
+
+		WARN_ON(mm->context.active != 0);
+
+		mm->context.id = MMU_NO_CONTEXT;
+	}
+	memset(stale_map[cpu], 0, CTX_MAP_SIZE);
+	memset(context_map, 0, CTX_MAP_SIZE);
+	context_map[0] = (1 << first_context) - 1;
+	nr_free_contexts = last_context - first_context + 1;
+}
+
 /* Note that this will also be called on SMP if all other CPUs are
  * offlined, which means that it may be called for cpu != 0. For
  * this to work, we somewhat assume that CPUs that are onlined
  * come up with a fully clean TLB (or are cleaned when offlined)
  */
 static unsigned int steal_context_up(unsigned int id)
 {
 	struct mm_struct *mm;
 	int cpu = smp_processor_id();
 
 	/* Pick up the victim mm */
 	mm = context_mm[id];
 
 	pr_debug("[%d] steal context %d from mm @%p\n", cpu, id, mm);
 
-	/* Mark this mm has having no context anymore */
-	mm->context.id = MMU_NO_CONTEXT;
-
 	/* Flush the TLB for that context */
 	local_flush_tlb_mm(mm);
 
+#ifdef CONFIG_FSL_BOOKE
+	flush_all_context(cpu);
+	__set_bit(id, context_map);
+	nr_free_contexts--;
+#else
+	/* Mark this mm has having no context anymore */
+	mm->context.id = MMU_NO_CONTEXT;
+
 	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
 	__clear_bit(id, stale_map[cpu]);
+#endif
 
 	return id;
 }
 
 #ifdef DEBUG_MAP_CONSISTENCY
 static void context_check_map(void)
 {
 	unsigned int id, nrf, nact;

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: can't flush tlb on e500
  2009-05-22  0:57 ` Hideo Saito
@ 2009-05-22  9:20   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 5+ messages in thread
From: Benjamin Herrenschmidt @ 2009-05-22  9:20 UTC (permalink / raw)
  To: Hideo Saito; +Cc: linux-kernel

On Fri, 2009-05-22 at 09:57 +0900, Hideo Saito wrote:
> How about following changes because all TLB entries are flushed
> repeatedly if processes overflow at the table mapping the context.
> I think that the table should be initialized again because _tlbil_pid()
> flushes all TLB entries and tlbilx instruction isn't supported on E500(MPC8548).

Not in this version of the CPU but in others... so we need to be a bit
more careful.

It's true that the current code will mark one context as stale but will
end up flushing them all from the TLB. This does have the
disadvantage that subsequent context switch might want to proceed to
more stealing with flush despite the fact that the TLB is indeed empty.

Your approach seems a bit hackish in the implementation but makes
somewhat sense in effectively disconnecting all context numbers when the
TLB is flushed. It's however not necessarily useable as-is on SMP.

One thing I would try to do in the future for example is to have a
lockless scenario when a context is already assigned. Your approach
would cause, under pressure, contexts to be de-assigned a bit to
aggressively. There are other reasons also why I wouldn't be so eager to
recycle PIDs.

Maybe a better option is to keep separate maps, between assigned
contexts and "dirty" contexts. IE. The current map means assigned. A
context that is switch_mm'ed is marked dirty. We could maybe use that to
make smarter decisions as to when re-assign a PID and when use one that
we know has been flushed out and not re-dirty'ed yet.

Also, it would also make some sense to have flush_tlb_mm() clear the
dirty map in that case as well. In fact, there's a patch coming from
Dave Kleikamp that implements lazy flushing by effectively just
recycling PIDs for flush_tlb_mm() and only do a flush when running out.

Cheers,
Ben.

> --- arch/powerpc/mm/mmu_context_nohash.c.orig	2009-03-24 08:12:14.000000000 +0900
> +++ arch/powerpc/mm/mmu_context_nohash.c	2009-05-21 16:35:09.000000000 +0900
> @@ -107,39 +107,69 @@ static unsigned int steal_context_smp(un
>  	 */
>  	spin_unlock(&context_lock);
>  	cpu_relax();
>  	spin_lock(&context_lock);
>  	goto again;
>  }
>  #endif  /* CONFIG_SMP */
>  
> +/*
> + * We're flushed using the all context
> + */
> +static void flush_all_context(int cpu)
> +{
> +	struct mm_struct *mm;
> +	int n;
> +
> +	for (n = first_context; n <= last_context; n++) {
> +
> +		mm = context_mm[n];
> +		if (mm == NULL || mm->context.id == MMU_NO_CONTEXT)
> +			continue;
> +
> +		WARN_ON(mm->context.active != 0);
> +
> +		mm->context.id = MMU_NO_CONTEXT;
> +	}
> +	memset(stale_map[cpu], 0, CTX_MAP_SIZE);
> +	memset(context_map, 0, CTX_MAP_SIZE);
> +	context_map[0] = (1 << first_context) - 1;
> +	nr_free_contexts = last_context - first_context + 1;
> +}
> +
>  /* Note that this will also be called on SMP if all other CPUs are
>   * offlined, which means that it may be called for cpu != 0. For
>   * this to work, we somewhat assume that CPUs that are onlined
>   * come up with a fully clean TLB (or are cleaned when offlined)
>   */
>  static unsigned int steal_context_up(unsigned int id)
>  {
>  	struct mm_struct *mm;
>  	int cpu = smp_processor_id();
>  
>  	/* Pick up the victim mm */
>  	mm = context_mm[id];
>  
>  	pr_debug("[%d] steal context %d from mm @%p\n", cpu, id, mm);
>  
> -	/* Mark this mm has having no context anymore */
> -	mm->context.id = MMU_NO_CONTEXT;
> -
>  	/* Flush the TLB for that context */
>  	local_flush_tlb_mm(mm);
>  
> +#ifdef CONFIG_FSL_BOOKE
> +	flush_all_context(cpu);
> +	__set_bit(id, context_map);
> +	nr_free_contexts--;
> +#else
> +	/* Mark this mm has having no context anymore */
> +	mm->context.id = MMU_NO_CONTEXT;
> +
>  	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
>  	__clear_bit(id, stale_map[cpu]);
> +#endif
>  
>  	return id;
>  }
>  
>  #ifdef DEBUG_MAP_CONSISTENCY
>  static void context_check_map(void)
>  {
>  	unsigned int id, nrf, nact;
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: can't flush tlb on e500
  2009-05-20 10:12 can't flush tlb on e500 Saito Hideo
  2009-05-22  0:57 ` Hideo Saito
@ 2009-05-22  9:27 ` Benjamin Herrenschmidt
  2009-05-22  9:42   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 5+ messages in thread
From: Benjamin Herrenschmidt @ 2009-05-22  9:27 UTC (permalink / raw)
  To: Saito Hideo; +Cc: linux-kernel, linuxppc-dev list, Kumar Gala

On Wed, 2009-05-20 at 19:12 +0900, Saito Hideo wrote:

> I think that the tlb should be cleared before mm->context.id is set
> MMU_NO_CONTEXT.

You are right, this definitely looks like a bug on platforms that have
HW support for the tlbil instruction (and thus care about the PID for
flushing) which afaik is only the case of recent freescale chips.

Have you verified that this change fixes your problem ?

Can you re-submit to linuxppc-dev@ozlabs.org mailing list, along with
proper changeset comment and signed-off-by: line ?

Cheers,
Ben.

> --- arch/powerpc/mm/mmu_context_nohash.c.orig	2009-03-24
> 08:12:14.000000000 +0900
> +++ arch/powerpc/mm/mmu_context_nohash.c	2009-05-20 18:33:53.000000000 +0900
> @@ -122,22 +122,22 @@ static unsigned int steal_context_up(uns
>  	struct mm_struct *mm;
>  	int cpu = smp_processor_id();
> 
>  	/* Pick up the victim mm */
>  	mm = context_mm[id];
> 
>  	pr_debug("[%d] steal context %d from mm @%p\n", cpu, id, mm);
> 
> -	/* Mark this mm has having no context anymore */
> -	mm->context.id = MMU_NO_CONTEXT;
> -
>  	/* Flush the TLB for that context */
>  	local_flush_tlb_mm(mm);
> 
> +	/* Mark this mm has having no context anymore */
> +	mm->context.id = MMU_NO_CONTEXT;
> +
>  	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
>  	__clear_bit(id, stale_map[cpu]);
> 
>  	return id;
>  }
> 
>  #ifdef DEBUG_MAP_CONSISTENCY
>  static void context_check_map(void)
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: can't flush tlb on e500
  2009-05-22  9:27 ` Benjamin Herrenschmidt
@ 2009-05-22  9:42   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 5+ messages in thread
From: Benjamin Herrenschmidt @ 2009-05-22  9:42 UTC (permalink / raw)
  To: Saito Hideo; +Cc: linuxppc-dev list, linux-kernel, Kumar Gala

On Fri, 2009-05-22 at 19:27 +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2009-05-20 at 19:12 +0900, Saito Hideo wrote:
> 
> > I think that the tlb should be cleared before mm->context.id is set
> > MMU_NO_CONTEXT.
> 
> You are right, this definitely looks like a bug on platforms that have
> HW support for the tlbil instruction (and thus care about the PID for
> flushing) which afaik is only the case of recent freescale chips.

In fact, you are doubly right in that it also happens on other platforms
because local_flush_tlb_mm() will check if the PID is MMU_NO_CONTEXT
regardless of what tlbilx supports.. oops

Looks like I only ran my context torture test with CONFIG_SMP enabled.
Shame on me.

> Have you verified that this change fixes your problem ?
> 
> Can you re-submit to linuxppc-dev@ozlabs.org mailing list, along with
> proper changeset comment and signed-off-by: line ?
> 
> Cheers,
> Ben.
> 
> > --- arch/powerpc/mm/mmu_context_nohash.c.orig	2009-03-24
> > 08:12:14.000000000 +0900
> > +++ arch/powerpc/mm/mmu_context_nohash.c	2009-05-20 18:33:53.000000000 +0900
> > @@ -122,22 +122,22 @@ static unsigned int steal_context_up(uns
> >  	struct mm_struct *mm;
> >  	int cpu = smp_processor_id();
> > 
> >  	/* Pick up the victim mm */
> >  	mm = context_mm[id];
> > 
> >  	pr_debug("[%d] steal context %d from mm @%p\n", cpu, id, mm);
> > 
> > -	/* Mark this mm has having no context anymore */
> > -	mm->context.id = MMU_NO_CONTEXT;
> > -
> >  	/* Flush the TLB for that context */
> >  	local_flush_tlb_mm(mm);
> > 
> > +	/* Mark this mm has having no context anymore */
> > +	mm->context.id = MMU_NO_CONTEXT;
> > +
> >  	/* XXX This clear should ultimately be part of local_flush_tlb_mm */
> >  	__clear_bit(id, stale_map[cpu]);
> > 
> >  	return id;
> >  }
> > 
> >  #ifdef DEBUG_MAP_CONSISTENCY
> >  static void context_check_map(void)
> > --
> > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> > the body of a message to majordomo@vger.kernel.org
> > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > Please read the FAQ at  http://www.tux.org/lkml/
> 
> _______________________________________________
> Linuxppc-dev mailing list
> Linuxppc-dev@ozlabs.org
> https://ozlabs.org/mailman/listinfo/linuxppc-dev


^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2009-05-22  9:42 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2009-05-20 10:12 can't flush tlb on e500 Saito Hideo
2009-05-22  0:57 ` Hideo Saito
2009-05-22  9:20   ` Benjamin Herrenschmidt
2009-05-22  9:27 ` Benjamin Herrenschmidt
2009-05-22  9:42   ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).