linux-kernel.vger.kernel.org archive mirror
* [PATCH] SN2 user-MMIO CPU migration
From: Brent Casavant @ 2006-01-20  0:06 UTC
  To: linux-ia64, linux-kernel; +Cc: jes, tony.luck

On SGI Altix, MMIO writes from separate CPUs to the same device are
not guaranteed to arrive at the device in order.  Tasks cannot migrate
between CPUs while executing in kernel context (e.g. in device driver
code), so kernel-initiated MMIO is safe.

However, if a device aperture is memory-mapped into a user process
which performs writes directly to the device (e.g. DRM), then it is
possible for the following incorrect sequence of events to occur:

	Task on CPU A issues write "A" to device
	Task migrates to CPU B
	Task on CPU B issues write "B" to device
	Device receives write "B"
	Device receives write "A"
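
For illustration, a minimal user-space sketch of the pattern that can
trigger this (the device path and register offsets are hypothetical):

	#include <fcntl.h>
	#include <stdint.h>
	#include <sys/mman.h>

	int main(void)
	{
		/* hypothetical device exporting a user-mappable MMIO aperture */
		int fd = open("/dev/mydevice", O_RDWR);
		volatile uint32_t *regs = mmap(NULL, 4096,
				PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

		regs[0] = 0xA;	/* write "A", issued from CPU A */
		/* ...scheduler may migrate the task here... */
		regs[1] = 0xB;	/* write "B", issued from CPU B; on SN2 it
				 * may reach the device before write "A" */
		return 0;
	}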

This patch introduces a new machvec entry point in the IA64 code,
called at task migration time.  It is a no-op on non-SN2 machines;
on SN2 it waits for a Shub status register to indicate that all
outstanding IO writes have been accepted.

A hook has also been added to the scheduler task migration code to
invoke this behavior at the appropriate points.  The hook compiles
away to nothing on all configs other than the IA64 generic and SN2
kernels.

Signed-off-by: Brent Casavant <bcasavan@sgi.com>

 arch/ia64/sn/kernel/setup.c       |    4 ++--
 arch/ia64/sn/kernel/sn2/sn2_smp.c |   19 ++++++++++++++++++-
 include/asm-ia64/machvec.h        |   13 +++++++++++++
 include/asm-ia64/machvec_sn2.h    |    4 +++-
 include/asm-ia64/system.h         |    2 ++
 kernel/sched.c                    |    7 +++++++
 6 files changed, 45 insertions(+), 4 deletions(-)

diff --git a/arch/ia64/sn/kernel/setup.c b/arch/ia64/sn/kernel/setup.c
index e510dce..8e40a2c 100644
--- a/arch/ia64/sn/kernel/setup.c
+++ b/arch/ia64/sn/kernel/setup.c
@@ -3,7 +3,7 @@
  * License.  See the file "COPYING" in the main directory of this archive
  * for more details.
  *
- * Copyright (C) 1999,2001-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 1999,2001-2006 Silicon Graphics, Inc. All rights reserved.
  */
 
 #include <linux/config.h>
@@ -654,7 +654,7 @@ void __init sn_cpu_init(void)
 			SH2_PIO_WRITE_STATUS_1, SH2_PIO_WRITE_STATUS_3};
 		u64 *pio;
 		pio = is_shub1() ? pio1 : pio2;
-		pda->pio_write_status_addr = (volatile unsigned long *) LOCAL_MMR_ADDR(pio[slice]);
+		pda->pio_write_status_addr = (volatile unsigned long *) GLOBAL_MMR_ADDR(nasid, pio[slice]);
 		pda->pio_write_status_val = is_shub1() ? SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK : 0;
 	}
 
diff --git a/arch/ia64/sn/kernel/sn2/sn2_smp.c b/arch/ia64/sn/kernel/sn2/sn2_smp.c
index 471bbaa..647c20d 100644
--- a/arch/ia64/sn/kernel/sn2/sn2_smp.c
+++ b/arch/ia64/sn/kernel/sn2/sn2_smp.c
@@ -5,7 +5,7 @@
  * License.  See the file "COPYING" in the main directory of this archive
  * for more details.
  *
- * Copyright (C) 2000-2005 Silicon Graphics, Inc. All rights reserved.
+ * Copyright (C) 2000-2006 Silicon Graphics, Inc. All rights reserved.
  */
 
 #include <linux/init.h>
@@ -169,6 +169,23 @@ static inline unsigned long wait_piowc(v
 	return ws;
 }
 
+void sn_task_migrate(struct task_struct *task) {
+	pda_t *old_pda = pdacpu(task_cpu(task));
+	volatile unsigned long *adr = old_pda->pio_write_status_addr;
+	unsigned long val = old_pda->pio_write_status_val;
+
+	/* Ensure user-mapped MMIO writes from old CPU have been accepted
+	 * by the IO hardware before resuming execution on the new CPU.
+	 * This is necessary as on SN2 PIO write ordering is not guaranteed
+	 * when the writes issue from separate CPUs (well, technically Shubs).
+	 * Without this it is possible for a task to issue write "A" from one
+	 * CPU, context switch to another CPU, issue write "B", but have
+	 * the IO hardware receive the writes in the order "B" then "A".
+	 */
+	while ((*adr & SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) != val)
+		cpu_relax();
+}
+
 void sn_tlb_migrate_finish(struct mm_struct *mm)
 {
 	if (mm == current->mm)
diff --git a/include/asm-ia64/machvec.h b/include/asm-ia64/machvec.h
index ca5ea99..5fb312c 100644
--- a/include/asm-ia64/machvec.h
+++ b/include/asm-ia64/machvec.h
@@ -27,6 +27,7 @@ typedef void ia64_mv_irq_init_t (void);
 typedef void ia64_mv_send_ipi_t (int, int, int, int);
 typedef void ia64_mv_timer_interrupt_t (int, void *, struct pt_regs *);
 typedef void ia64_mv_global_tlb_purge_t (struct mm_struct *, unsigned long, unsigned long, unsigned long);
+typedef void ia64_mv_task_migrate_t (struct task_struct *);
 typedef void ia64_mv_tlb_migrate_finish_t (struct mm_struct *);
 typedef unsigned int ia64_mv_local_vector_to_irq (u8);
 typedef char *ia64_mv_pci_get_legacy_mem_t (struct pci_bus *);
@@ -85,10 +86,16 @@ machvec_noop_mm (struct mm_struct *mm)
 {
 }
 
+static inline void
+machvec_noop_task (struct task_struct *task)
+{
+}
+
 extern void machvec_setup (char **);
 extern void machvec_timer_interrupt (int, void *, struct pt_regs *);
 extern void machvec_dma_sync_single (struct device *, dma_addr_t, size_t, int);
 extern void machvec_dma_sync_sg (struct device *, struct scatterlist *, int, int);
+extern void machvec_task_migrate (struct task_struct *);
 extern void machvec_tlb_migrate_finish (struct mm_struct *);
 
 # if defined (CONFIG_IA64_HP_SIM)
@@ -113,6 +120,7 @@ extern void machvec_tlb_migrate_finish (
 #  define platform_send_ipi	ia64_mv.send_ipi
 #  define platform_timer_interrupt	ia64_mv.timer_interrupt
 #  define platform_global_tlb_purge	ia64_mv.global_tlb_purge
+#  define platform_task_migrate		ia64_mv.task_migrate
 #  define platform_tlb_migrate_finish	ia64_mv.tlb_migrate_finish
 #  define platform_dma_init		ia64_mv.dma_init
 #  define platform_dma_alloc_coherent	ia64_mv.dma_alloc_coherent
@@ -161,6 +169,7 @@ struct ia64_machine_vector {
 	ia64_mv_send_ipi_t *send_ipi;
 	ia64_mv_timer_interrupt_t *timer_interrupt;
 	ia64_mv_global_tlb_purge_t *global_tlb_purge;
+	ia64_mv_task_migrate_t *task_migrate;
 	ia64_mv_tlb_migrate_finish_t *tlb_migrate_finish;
 	ia64_mv_dma_init *dma_init;
 	ia64_mv_dma_alloc_coherent *dma_alloc_coherent;
@@ -205,6 +214,7 @@ struct ia64_machine_vector {
 	platform_send_ipi,			\
 	platform_timer_interrupt,		\
 	platform_global_tlb_purge,		\
+	platform_task_migrate,			\
 	platform_tlb_migrate_finish,		\
 	platform_dma_init,			\
 	platform_dma_alloc_coherent,		\
@@ -287,6 +297,9 @@ extern ia64_mv_dma_supported		swiotlb_dm
 #ifndef platform_global_tlb_purge
 # define platform_global_tlb_purge	ia64_global_tlb_purge /* default to architected version */
 #endif
+#ifndef platform_task_migrate
+# define platform_task_migrate		machvec_noop_task
+#endif
 #ifndef platform_tlb_migrate_finish
 # define platform_tlb_migrate_finish	machvec_noop_mm
 #endif
diff --git a/include/asm-ia64/machvec_sn2.h b/include/asm-ia64/machvec_sn2.h
index e1b6cd6..c91fd03 100644
--- a/include/asm-ia64/machvec_sn2.h
+++ b/include/asm-ia64/machvec_sn2.h
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2002-2003 Silicon Graphics, Inc.  All Rights Reserved.
+ * Copyright (c) 2002-2006 Silicon Graphics, Inc.  All Rights Reserved.
  * 
  * This program is free software; you can redistribute it and/or modify it 
  * under the terms of version 2 of the GNU General Public License 
@@ -39,6 +39,7 @@ extern ia64_mv_irq_init_t sn_irq_init;
 extern ia64_mv_send_ipi_t sn2_send_IPI;
 extern ia64_mv_timer_interrupt_t sn_timer_interrupt;
 extern ia64_mv_global_tlb_purge_t sn2_global_tlb_purge;
+extern ia64_mv_task_migrate_t sn_task_migrate;
 extern ia64_mv_tlb_migrate_finish_t	sn_tlb_migrate_finish;
 extern ia64_mv_local_vector_to_irq sn_local_vector_to_irq;
 extern ia64_mv_pci_get_legacy_mem_t sn_pci_get_legacy_mem;
@@ -86,6 +87,7 @@ extern ia64_mv_dma_supported		sn_dma_sup
 #define platform_send_ipi		sn2_send_IPI
 #define platform_timer_interrupt	sn_timer_interrupt
 #define platform_global_tlb_purge       sn2_global_tlb_purge
+#define platform_task_migrate		sn_task_migrate
 #define platform_tlb_migrate_finish	sn_tlb_migrate_finish
 #define platform_pci_fixup		sn_pci_fixup
 #define platform_inb			__sn_inb
diff --git a/include/asm-ia64/system.h b/include/asm-ia64/system.h
index 80c5a23..47093c1 100644
--- a/include/asm-ia64/system.h
+++ b/include/asm-ia64/system.h
@@ -249,6 +249,8 @@ extern void ia64_load_extra (struct task
 # define switch_to(prev,next,last)	__switch_to(prev, next, last)
 #endif
 
+#define arch_task_migrate(task)	platform_task_migrate(task)
+
 /*
  * On IA-64, we don't want to hold the runqueue's lock during the low-level context-switch,
  * because that could cause a deadlock.  Here is an example by Erich Focht:
diff --git a/kernel/sched.c b/kernel/sched.c
index 788ecce..d8375c0 100644
--- a/kernel/sched.c
+++ b/kernel/sched.c
@@ -291,6 +291,9 @@ for (domain = rcu_dereference(cpu_rq(cpu
 #ifndef finish_arch_switch
 # define finish_arch_switch(prev)	do { } while (0)
 #endif
+#ifndef arch_task_migrate
+# define arch_task_migrate(task)	do { } while (0)
+#endif
 
 #ifndef __ARCH_WANT_UNLOCKED_CTXSW
 static inline int task_running(runqueue_t *rq, task_t *p)
@@ -936,6 +939,7 @@ static int migrate_task(task_t *p, int d
 	 * it is sufficient to simply update the task's cpu field.
 	 */
 	if (!p->array && !task_running(rq, p)) {
+		arch_task_migrate(p);
 		set_task_cpu(p, dest_cpu);
 		return 0;
 	}
@@ -1353,6 +1357,7 @@ static int try_to_wake_up(task_t *p, uns
 out_set_cpu:
 	new_cpu = wake_idle(new_cpu, p);
 	if (new_cpu != cpu) {
+		arch_task_migrate(p);
 		set_task_cpu(p, new_cpu);
 		task_rq_unlock(rq, &flags);
 		/* might preempt at this point */
@@ -1876,6 +1881,7 @@ void pull_task(runqueue_t *src_rq, prio_
 {
 	dequeue_task(p, src_array);
 	dec_nr_running(p, src_rq);
+	arch_task_migrate(p);
 	set_task_cpu(p, this_cpu);
 	inc_nr_running(p, this_rq);
 	enqueue_task(p, this_array);
@@ -4547,6 +4553,7 @@ static void __migrate_task(struct task_s
 	if (!cpu_isset(dest_cpu, p->cpus_allowed))
 		goto out;
 
+	arch_task_migrate(p);
 	set_task_cpu(p, dest_cpu);
 	if (p->array) {
 		/*


* Re: [PATCH] SN2 user-MMIO CPU migration
From: Jesse Barnes @ 2006-01-20  2:18 UTC
  To: Brent Casavant; +Cc: linux-ia64, linux-kernel, jes, tony.luck

On Thursday, January 19, 2006 4:06 pm, Brent Casavant wrote:
>  #ifndef __ARCH_WANT_UNLOCKED_CTXSW
>  static inline int task_running(runqueue_t *rq, task_t *p)
> @@ -936,6 +939,7 @@ static int migrate_task(task_t *p, int d
>  	 * it is sufficient to simply update the task's cpu field.
>  	 */
>  	if (!p->array && !task_running(rq, p)) {
> +		arch_task_migrate(p);
>  		set_task_cpu(p, dest_cpu);
>  		return 0;
>  	}
> @@ -1353,6 +1357,7 @@ static int try_to_wake_up(task_t *p, uns
>  out_set_cpu:
>  	new_cpu = wake_idle(new_cpu, p);
>  	if (new_cpu != cpu) {
> +		arch_task_migrate(p);
>  		set_task_cpu(p, new_cpu);
>  		task_rq_unlock(rq, &flags);
>  		/* might preempt at this point */
> @@ -1876,6 +1881,7 @@ void pull_task(runqueue_t *src_rq, prio_
>  {
>  	dequeue_task(p, src_array);
>  	dec_nr_running(p, src_rq);
> +	arch_task_migrate(p);
>  	set_task_cpu(p, this_cpu);
>  	inc_nr_running(p, this_rq);
>  	enqueue_task(p, this_array);
> @@ -4547,6 +4553,7 @@ static void __migrate_task(struct task_s
>  	if (!cpu_isset(dest_cpu, p->cpus_allowed))
>  		goto out;
>
> +	arch_task_migrate(p);
>  	set_task_cpu(p, dest_cpu);
>  	if (p->array) {
>  		/*

Maybe you could just turn the above into mmiowb() calls instead?  That
would cover Altix, Origin, and ppc as well, I think.  On other platforms
it would be a complete no-op.
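
E.g., roughly (just a sketch of the substitution, reusing the hunk above):

	if (!p->array && !task_running(rq, p)) {
		mmiowb();	/* instead of arch_task_migrate(p) */
		set_task_cpu(p, dest_cpu);
		return 0;
	}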

Jesse


* Re: [PATCH] SN2 user-MMIO CPU migration
From: Brent Casavant @ 2006-01-20  6:47 UTC
  To: Jesse Barnes; +Cc: linux-ia64, linux-kernel, jes, tony.luck

On Thu, 19 Jan 2006, Jesse Barnes wrote:

> Maybe you could just turn the above into mmiowb() calls instead?  That 
> would cover altix, origin, and ppc as well I think.  On other platforms 
> it would be a complete no-op.

As you obviously noted, the core of the code was lifted from mmiowb().
But no, an mmiowb() as such isn't correct.  At the time this code is
executing, it's on a CPU remote from the one which issued any PIO writes
to the device.  So in this case we need to poll the Shub register for
a remote node, but mmiowb() only polls for the Shub corresponding to
the current CPU.
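
Roughly, the difference looks like this (a sketch distilled from the
patch above, not verbatim kernel code):

	/* mmiowb()-style wait: polls this CPU's own Shub register */
	while ((*pda->pio_write_status_addr &
		SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) !=
	       pda->pio_write_status_val)
		cpu_relax();

	/* migration-time wait: polls the Shub of the CPU being left,
	 * where the task's outstanding writes may still be pending */
	pda_t *old_pda = pdacpu(task_cpu(task));
	while ((*old_pda->pio_write_status_addr &
		SH_PIO_WRITE_STATUS_PENDING_WRITE_COUNT_MASK) !=
	       old_pda->pio_write_status_val)
		cpu_relax();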

My first incarnation of this patch (never publicly posted) instead
implemented a new mmiowb_remote(cpu) machvec, placed in the
context-switch (in) path rather than the task migration path.
However, since this behavior is only needed when a task migrates,
Jack Steiner pointed out that hooking migration directly is the more
appropriate implementation.  And as migration is much less frequent
than context switching, it also performs better.

Thanks,
Brent

-- 
Brent Casavant                          All music is folk music.  I ain't
bcasavan@sgi.com                        never heard a horse sing a song.
Silicon Graphics, Inc.                    -- Louis Armstrong


* Re: [PATCH] SN2 user-MMIO CPU migration
From: Ingo Molnar @ 2006-01-20  8:36 UTC
  To: Brent Casavant; +Cc: linux-ia64, linux-kernel, jes, tony.luck


* Brent Casavant <bcasavan@sgi.com> wrote:

> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -291,6 +291,9 @@ for (domain = rcu_dereference(cpu_rq(cpu
>  #ifndef finish_arch_switch
>  # define finish_arch_switch(prev)	do { } while (0)
>  #endif
> +#ifndef arch_task_migrate
> +# define arch_task_migrate(task)	do { } while (0)
> +#endif

>  	if (!p->array && !task_running(rq, p)) {
> +		arch_task_migrate(p);
>  		set_task_cpu(p, dest_cpu);

>  	if (new_cpu != cpu) {
> +		arch_task_migrate(p);
>  		set_task_cpu(p, new_cpu);

>  	dec_nr_running(p, src_rq);
> +	arch_task_migrate(p);
>  	set_task_cpu(p, this_cpu);

> +	arch_task_migrate(p);
>  	set_task_cpu(p, dest_cpu);

hm, why isn't the synchronization done in switch_to()? Your arch-level
switch_to() could have something like thread->last_cpu_sync, and if
thread->last_cpu_sync != this_cpu, do the flush. This would not only
keep this stuff out of the generic scheduler, but it would also optimize
things a bit more: the moment we do a set_task_cpu() it does not mean
that CPU _will_ run the task. Another CPU could grab that task later on.
So we should delay such IO-synchronization to the last possible moment:
when we know that we've hit a new CPU on which we haven't done a flush
yet. For same-CPU context switches there wouldn't be any extra
synchronization, because thread->last_cpu_sync == this_cpu.
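
For concreteness, a rough sketch of the shape this could take (the
last_cpu_sync field and the drain helper are illustrative names only,
not existing code):

	#define switch_to(prev, next, last)					\
	do {									\
		int __cpu = smp_processor_id();					\
		if ((next)->thread.last_cpu_sync != __cpu) {			\
			/* drain PIO writes still pending on the CPU the	\
			 * task last synchronized on, before running here */	\
			sn_drain_pio_writes((next)->thread.last_cpu_sync);	\
			(next)->thread.last_cpu_sync = __cpu;			\
		}								\
		__switch_to(prev, next, last);					\
	} while (0)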

	Ingo


* Re: [PATCH] SN2 user-MMIO CPU migration
From: Jack Steiner @ 2006-01-20 13:26 UTC
  To: Jesse Barnes; +Cc: Brent Casavant, linux-ia64, linux-kernel, jes, tony.luck

On Thu, Jan 19, 2006 at 06:18:43PM -0800, Jesse Barnes wrote:
> On Thursday, January 19, 2006 4:06 pm, Brent Casavant wrote:
> >  #ifndef __ARCH_WANT_UNLOCKED_CTXSW
> >  static inline int task_running(runqueue_t *rq, task_t *p)
> > @@ -936,6 +939,7 @@ static int migrate_task(task_t *p, int d
> >  	 * it is sufficient to simply update the task's cpu field.
> >  	 */
> >  	if (!p->array && !task_running(rq, p)) {
> > +		arch_task_migrate(p);
> >  		set_task_cpu(p, dest_cpu);
> >  		return 0;
> >  	}
> > @@ -1353,6 +1357,7 @@ static int try_to_wake_up(task_t *p, uns
> >  out_set_cpu:
> >  	new_cpu = wake_idle(new_cpu, p);
> >  	if (new_cpu != cpu) {
> > +		arch_task_migrate(p);
> >  		set_task_cpu(p, new_cpu);
> >  		task_rq_unlock(rq, &flags);
> >  		/* might preempt at this point */
> > @@ -1876,6 +1881,7 @@ void pull_task(runqueue_t *src_rq, prio_
> >  {
> >  	dequeue_task(p, src_array);
> >  	dec_nr_running(p, src_rq);
> > +	arch_task_migrate(p);
> >  	set_task_cpu(p, this_cpu);
> >  	inc_nr_running(p, this_rq);
> >  	enqueue_task(p, this_array);
> > @@ -4547,6 +4553,7 @@ static void __migrate_task(struct task_s
> >  	if (!cpu_isset(dest_cpu, p->cpus_allowed))
> >  		goto out;
> >
> > +	arch_task_migrate(p);
> >  	set_task_cpu(p, dest_cpu);
> >  	if (p->array) {
> >  		/*
> 
> Maybe you could just turn the above into mmiowb() calls instead?  That 
> would cover altix, origin, and ppc as well I think.  On other platforms 
> it would be a complete no-op.
> 
> Jesse

I don't think calling mmiowb() directly would work.  In order to make
CONFIG_IA64_GENERIC work, the call to mmiowb() needs to be underneath a
platform vector.  Using ia64_platform_is() would also work, but I think
a platform vector is cleaner.

A second reason for an arch_task_migrate() instead of a specific mmiowb()
is to provide a hook for a future platform that requires additional work
to be done when a task migrates.


-- 
Thanks

Jack Steiner (steiner@sgi.com)          651-683-5302




* Re: [PATCH] SN2 user-MMIO CPU migration
From: Brent Casavant @ 2006-01-20 16:14 UTC
  To: Ingo Molnar; +Cc: linux-ia64, linux-kernel, jes, tony.luck

On Fri, 20 Jan 2006, Ingo Molnar wrote:

> hm, why isn't the synchronization done in switch_to()? Your arch-level 
> switch_to() could have something like thread->last_cpu_sync, and if 
> thread->last_cpu_sync != this_cpu, do the flush. This would not only 
> keep this stuff out of the generic scheduler, but it would also optimize 
> things a bit more: the moment we do a set_task_cpu() it does not mean 
> that CPU _will_ run the task. Another CPU could grab that task later on.  

Very good points all around.  I'll rework the changes in just the
manner you mentioned.

Brent

-- 
Brent Casavant                          All music is folk music.  I ain't
bcasavan@sgi.com                        never heard a horse sing a song.
Silicon Graphics, Inc.                    -- Louis Armstrong


* Re: [PATCH] SN2 user-MMIO CPU migration
From: Jesse Barnes @ 2006-01-20 17:31 UTC
  To: Jack Steiner; +Cc: Brent Casavant, linux-ia64, linux-kernel, jes, tony.luck

On Friday, January 20, 2006 5:26 am, Jack Steiner wrote:
> I don't think calling mmiowb() directly would work.  In order to make
> CONFIG_IA64_GENERIC work, the call to mmiowb() needs to be underneath a
> platform vector.  Using ia64_platform_is() would also work, but I think
> a platform vector is cleaner.

mmiowb is already a platform vector on ia64, so I think you're ok there.

> A second reason for an arch_task_migrate() instead of a specific
> mmiowb() is to provide a hook for a future platform that requires
> additional work to be done when a task migrates.

What does the new platform require (just curious)?

Jesse


* Re: [PATCH] SN2 user-MMIO CPU migration
From: Jesse Barnes @ 2006-01-20 17:36 UTC
  To: Brent Casavant; +Cc: linux-ia64, linux-kernel, jes, tony.luck

On Thursday, January 19, 2006 10:47 pm, Brent Casavant wrote:
> On Thu, 19 Jan 2006, Jesse Barnes wrote:
> > Maybe you could just turn the above into mmiowb() calls instead? 
> > That would cover altix, origin, and ppc as well I think.  On other
> > platforms it would be a complete no-op.
>
> As you obviously noted, the core of the code was lifted from mmiowb().
> But no, an mmiowb() as such isn't correct.  At the time this code is
> executing, it's on a CPU remote from the one which issued any PIO
> writes to the device.  So in this case we need to poll the Shub
> register for a remote node, but mmiowb() only polls for the Shub
> corresponding to the current CPU.

Ah, ok.  It sounds like Ingo might have a better place to put it anyway.
(I was thinking this was on the switch-out path on the CPU where the
task last ran; I didn't look at it in detail.)

Of course, the other option is just to require tasks that do MMIO
accesses from userspace to be pinned to a particular CPU or node. :)

Thanks,
Jesse


* Re: [PATCH] SN2 user-MMIO CPU migration
From: Jack Steiner @ 2006-01-20 19:00 UTC
  To: Jesse Barnes; +Cc: Brent Casavant, linux-ia64, linux-kernel, jes, tony.luck

> > A second reason for an arch_task_migrate() instead of a specific
> > mmiowb() is to provide a hook for a future platform that requires
> > additional work to be done when a task migrates.
> 
> What does the new platform require (just curious)?
> 
> Jesse

Sorry, can't say. Rejoin SGI & I'll tell you :-)

---
Jack 




* Re: [PATCH] SN2 user-MMIO CPU migration
From: Brent Casavant @ 2006-01-20 20:01 UTC
  To: Jesse Barnes; +Cc: linux-ia64, linux-kernel, jes, tony.luck

On Fri, 20 Jan 2006, Jesse Barnes wrote:

> Of course, the other option is just to require tasks that do MMIO 
> accesses from userspace to be pinned to particular CPU or node. :)

One idea I had was to add a counter to the mm struct that gets
bumped whenever the process performs an MMIO mapping, so that only
affected processes pay the penalty.  However, the added complexity
in the drivers (e.g. handling partial unmaps) doesn't seem worth
it.  On average this code adds 800ns to the task migration path, which
is relatively infrequent and already a bit expensive (what with cold
caches and the like).

Regarding the direction Ingo sent me down, and considering what Jack
said about needing a hook for a future platform, I'm thinking of grabbing
a bit in task->thread.flags that IA64_HAS_EXTRA_STATE() could detect and
let ia64_{save,load}_extra() call new machvecs to perform this
chipset-specific context management.  It's a bit overengineered for
my particular case, but would allow Jack to plug in his work very
cleanly.
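
A rough sketch of that shape (the flag bit and machvec name below are
hypothetical, nothing that exists today):

	/* hypothetical: a free bit in task->thread.flags */
	#define IA64_THREAD_MIGRATE_SYNC	(1UL << 5)

	void ia64_load_extra (struct task_struct *task)
	{
		/* ...existing debug-register/perfmon reload... */
		if (task->thread.flags & IA64_THREAD_MIGRATE_SYNC)
			platform_migrate_sync(task);	/* hypothetical machvec */
	}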

Brent

-- 
Brent Casavant                          All music is folk music.  I ain't
bcasavan@sgi.com                        never heard a horse sing a song.
Silicon Graphics, Inc.                    -- Louis Armstrong

