All of lore.kernel.org
 help / color / mirror / Atom feed
* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
@ 2013-06-20  8:43 Marc Kleine-Budde
  2013-06-20  9:25 ` Will Deacon
  2013-06-20  9:57 ` Catalin Marinas
  0 siblings, 2 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20  8:43 UTC (permalink / raw)
  To: linux-arm-kernel

Hello Catalin,

on current linus/master on armv5 we observed stack trashing[1] on high
load. I bisected this problem down to commit:

> b9d4d42ad901cc848ac87f1cb8923fded3645568 is the first bad commit
> commit b9d4d42ad901cc848ac87f1cb8923fded3645568
> Author: Catalin Marinas <catalin.marinas@arm.com>
> Date:   Mon Nov 28 21:57:24 2011 +0000
> 
>     ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs
> 
>     This patch removes the __ARCH_WANT_INTERRUPTS_ON_CTXSW definition for
>     ARMv5 and earlier processors. On such processors, the context switch
>     requires a full cache flush. To avoid high interrupt latencies, this
>     patch defers the mm switching to the post-lock switch hook if the
>     interrupts are disabled.
> 
>     Reviewed-by: Will Deacon <will.deacon@arm.com>
>     Tested-by: Will Deacon <will.deacon@arm.com>
>     Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
>     Tested-by: Marc Zyngier <Marc.Zyngier@arm.com>
>     Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> 
> :040000 040000 034899bdcbc9aa59b5455a85a9d78b646b4cf784 ecc23e33a4ca807d4153f87fbea85a9437ff2928 M      arch

The problem can be reproduced on several mx28 and an at91sam9263 and
only occurs of CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop))
is enabled.

I have the gut feeling that the "if (irqs_disabled())" check in the
above patch is not correct for CONFIG_PREEMPT.

Marc

[1] The test program creating 150 threads in a row crashes like this:
> [8]+  Segmentation fault         ./thread_test_c
> [5]   Illegal instruction        ./thread_test_c
> [4]   Illegal instruction        ./thread_test_c
> [2]   Segmentation fault         ./thread_test_c
> [3]   Illegal instruction        ./thread_test_c
> [1]   Segmentation fault         ./thread_test_c

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/11a01756/attachment-0001.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20  8:43 BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT Marc Kleine-Budde
@ 2013-06-20  9:25 ` Will Deacon
  2013-06-20  9:51   ` Marc Kleine-Budde
  2013-06-20  9:57 ` Catalin Marinas
  1 sibling, 1 reply; 20+ messages in thread
From: Will Deacon @ 2013-06-20  9:25 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marc,

On Thu, Jun 20, 2013 at 09:43:33AM +0100, Marc Kleine-Budde wrote:
> The problem can be reproduced on several mx28 and an at91sam9263 and
> only occurs of CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop))
> is enabled.
> 
> I have the gut feeling that the "if (irqs_disabled())" check in the
> above patch is not correct for CONFIG_PREEMPT.
> 
> Marc
> 
> [1] The test program creating 150 threads in a row crashes like this:
> > [8]+  Segmentation fault         ./thread_test_c
> > [5]   Illegal instruction        ./thread_test_c
> > [4]   Illegal instruction        ./thread_test_c
> > [2]   Segmentation fault         ./thread_test_c
> > [3]   Illegal instruction        ./thread_test_c
> > [1]   Segmentation fault         ./thread_test_c

Any chance you can make your test available somewhere please? I remember
testing on a 926 when the patch was written, so it could be a bad
interaction with some later changes.

Cheers,

Will

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20  9:25 ` Will Deacon
@ 2013-06-20  9:51   ` Marc Kleine-Budde
  0 siblings, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20  9:51 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 11:25 AM, Will Deacon wrote:
> On Thu, Jun 20, 2013 at 09:43:33AM +0100, Marc Kleine-Budde wrote:
>> The problem can be reproduced on several mx28 and an at91sam9263 and
>> only occurs of CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop))
>> is enabled.
>>
>> I have the gut feeling that the "if (irqs_disabled())" check in the
>> above patch is not correct for CONFIG_PREEMPT.
>>
>> Marc
>>
>> [1] The test program creating 150 threads in a row crashes like this:
>>> [8]+  Segmentation fault         ./thread_test_c
>>> [5]   Illegal instruction        ./thread_test_c
>>> [4]   Illegal instruction        ./thread_test_c
>>> [2]   Segmentation fault         ./thread_test_c
>>> [3]   Illegal instruction        ./thread_test_c
>>> [1]   Segmentation fault         ./thread_test_c
> 
> Any chance you can make your test available somewhere please? I remember
> testing on a 926 when the patch was written, so it could be a bad
> interaction with some later changes.

Test attached. Nothing fancy. compile with:

    gcc -Wall -O0 main-c.c -o main-c -lpthread -lrt

Then start several (here 16) of these tests in parallel in the background:

> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~ ./main-c &
> root at halo22:~
> root at halo22:~ jobs
> [16]+  Running                   ./main-c
> [15]-  Running                   ./main-c
> [14]   Running                   ./main-c
> [13]   Running                   ./main-c
> [12]   Running                   ./main-c
> [11]   Running                   ./main-c
> [10]   Running                   ./main-c
> [9]   Running                    ./main-c
> [8]   Running                    ./main-c
> [7]   Running                    ./main-c
> [6]   Running                    ./main-c
> [5]   Running                    ./main-c
> [4]   Running                    ./main-c
> [3]   Running                    ./main-c
> [2]   Running                    ./main-c
> [1]   Running                    ./main-c
> root at halo22:~ jobs
> [16]+  Running                   ./main-c
> [15]-  Running                   ./main-c
> [14]   Running                   ./main-c
> [13]   Running                   ./main-c
> [12]   Running                   ./main-c
> [11]   Running                   ./main-c
> [10]   Running                   ./main-c
> [9]   Running                    ./main-c
> [8]   Running                    ./main-c
> [7]   Running                    ./main-c
> [6]   Running                    ./main-c
> [5]   Running                    ./main-c
> [4]   Segmentation fault         ./main-c
> [3]   Running                    ./main-c
> [2]   Running                    ./main-c
> [1]   Running                    ./main-c
> root at halo22:~ jobs
> [16]+  Running                   ./main-c
> [15]-  Segmentation fault        ./main-c
> [14]-  Segmentation fault        ./main-c
> [13]-  Running                   ./main-c
> [12]   Running                   ./main-c
> [11]   Segmentation fault        ./main-c
> [10]   Segmentation fault        ./main-c
> [9]   Running                    ./main-c
> [8]   Running                    ./main-c
> [7]   Segmentation fault         ./main-c
> [6]   Illegal instruction        ./main-c
> [5]   Segmentation fault         ./main-c
> [3]   Segmentation fault         ./main-c
> [2]   Illegal instruction        ./main-c
> [1]   Running                    ./main-c
> [16]+  Illegal instruction       ./main-c

However during my tests it shows that it crashes on mx28 (400 MHz)
within seconds, on an at91sam9263 (200 MHx) it takes up to several
minutes. The current number of threads and time is sleeps must hit a
"sweet spot" on an usual mx28.

I have attached the defconfig I used with linus/master on mx28evk.  This
is the basically the arch/arm/configs/mxs_defconfig with preemption
model set to CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop)).

regards,
Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |
-------------- next part --------------
A non-text attachment was scrubbed...
Name: main-c.c
Type: text/x-csrc
Size: 1906 bytes
Desc: not available
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/1a2634d5/attachment.bin>
-------------- next part --------------
CONFIG_EXPERIMENTAL=y
CONFIG_SYSVIPC=y
CONFIG_TASKSTATS=y
CONFIG_TASK_DELAY_ACCT=y
CONFIG_TASK_XACCT=y
CONFIG_TASK_IO_ACCOUNTING=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_UTS_NS is not set
# CONFIG_IPC_NS is not set
# CONFIG_PID_NS is not set
# CONFIG_NET_NS is not set
CONFIG_PERF_EVENTS=y
# CONFIG_COMPAT_BRK is not set
CONFIG_MODULES=y
CONFIG_MODULE_FORCE_LOAD=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
CONFIG_BLK_DEV_INTEGRITY=y
# CONFIG_IOSCHED_DEADLINE is not set
# CONFIG_IOSCHED_CFQ is not set
CONFIG_ARCH_MXS=y
CONFIG_MACH_STMP378X_DEVB=y
CONFIG_MACH_MX23EVK=y
CONFIG_MACH_MX28EVK=y
CONFIG_MACH_TX28=y
CONFIG_MACH_M28EVK=y
# CONFIG_ARM_THUMB is not set
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_PREEMPT=y
CONFIG_AEABI=y
CONFIG_USE_OF=y
CONFIG_AUTO_ZRELADDR=y
CONFIG_FPE_NWFPE=y
CONFIG_NET=y
CONFIG_PACKET=y
CONFIG_UNIX=y
CONFIG_INET=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_SYN_COOKIES=y
# CONFIG_INET_XFRM_MODE_TRANSPORT is not set
# CONFIG_INET_XFRM_MODE_TUNNEL is not set
# CONFIG_INET_XFRM_MODE_BEET is not set
# CONFIG_INET_LRO is not set
# CONFIG_INET_DIAG is not set
# CONFIG_IPV6 is not set
CONFIG_CAN=m
CONFIG_CAN_RAW=m
CONFIG_CAN_BCM=m
CONFIG_CAN_FLEXCAN=m
# CONFIG_WIRELESS is not set
CONFIG_DEVTMPFS=y
# CONFIG_FIRMWARE_IN_KERNEL is not set
# CONFIG_BLK_DEV is not set
CONFIG_NETDEVICES=y
CONFIG_ENC28J60=y
# CONFIG_WLAN is not set
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_EVDEV=m
# CONFIG_INPUT_KEYBOARD is not set
# CONFIG_INPUT_MOUSE is not set
CONFIG_INPUT_TOUCHSCREEN=y
CONFIG_TOUCHSCREEN_TSC2007=m
# CONFIG_SERIO is not set
CONFIG_VT_HW_CONSOLE_BINDING=y
CONFIG_DEVPTS_MULTIPLE_INSTANCES=y
# CONFIG_LEGACY_PTYS is not set
# CONFIG_DEVKMEM is not set
CONFIG_SERIAL_AMBA_PL011=y
CONFIG_SERIAL_AMBA_PL011_CONSOLE=y
# CONFIG_HW_RANDOM is not set
CONFIG_I2C=y
# CONFIG_I2C_COMPAT is not set
CONFIG_I2C_CHARDEV=y
CONFIG_I2C_MXS=y
CONFIG_SPI=y
CONFIG_SPI_GPIO=m
CONFIG_DEBUG_GPIO=y
CONFIG_GPIO_SYSFS=y
# CONFIG_HWMON is not set
CONFIG_REGULATOR=y
CONFIG_REGULATOR_FIXED_VOLTAGE=y
CONFIG_SOUND=y
CONFIG_SND=y
CONFIG_SND_SOC=y
CONFIG_SND_MXS_SOC=y
CONFIG_SND_SOC_MXS_SGTL5000=y
# CONFIG_USB_SUPPORT is not set
CONFIG_MMC=y
CONFIG_MMC_MXS=y
CONFIG_RTC_CLASS=y
CONFIG_RTC_DRV_DS1307=m
CONFIG_DMADEVICES=y
CONFIG_MXS_DMA=y
CONFIG_EXT3_FS=y
# CONFIG_DNOTIFY is not set
CONFIG_FSCACHE=m
CONFIG_FSCACHE_STATS=y
CONFIG_CACHEFILES=m
CONFIG_TMPFS=y
CONFIG_TMPFS_POSIX_ACL=y
# CONFIG_MISC_FILESYSTEMS is not set
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFS_V3_ACL=y
CONFIG_NFS_V4=y
CONFIG_ROOT_NFS=y
CONFIG_PRINTK_TIME=y
CONFIG_FRAME_WARN=2048
CONFIG_MAGIC_SYSRQ=y
CONFIG_UNUSED_SYMBOLS=y
CONFIG_DEBUG_KERNEL=y
CONFIG_LOCKUP_DETECTOR=y
CONFIG_TIMER_STATS=y
CONFIG_PROVE_LOCKING=y
CONFIG_DEBUG_INFO=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_STRICT_DEVMEM=y
CONFIG_DEBUG_USER=y
CONFIG_CRYPTO=y
CONFIG_CRYPTO_CRC32C=m
# CONFIG_CRYPTO_ANSI_CPRNG is not set
# CONFIG_CRYPTO_HW is not set
CONFIG_CRC_ITU_T=m
CONFIG_CRC7=m
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/1a2634d5/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20  8:43 BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT Marc Kleine-Budde
  2013-06-20  9:25 ` Will Deacon
@ 2013-06-20  9:57 ` Catalin Marinas
  2013-06-20 10:08   ` Marc Kleine-Budde
  2013-06-20 10:14   ` Marc Kleine-Budde
  1 sibling, 2 replies; 20+ messages in thread
From: Catalin Marinas @ 2013-06-20  9:57 UTC (permalink / raw)
  To: linux-arm-kernel

Hi Marc,

On Thu, Jun 20, 2013 at 09:43:33AM +0100, Marc Kleine-Budde wrote:
> on current linus/master on armv5 we observed stack trashing[1] on high
> load. I bisected this problem down to commit:
> 
> > b9d4d42ad901cc848ac87f1cb8923fded3645568 is the first bad commit
> > commit b9d4d42ad901cc848ac87f1cb8923fded3645568
> > Author: Catalin Marinas <catalin.marinas@arm.com>
> > Date:   Mon Nov 28 21:57:24 2011 +0000
> > 
> >     ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs
> > 
> >     This patch removes the __ARCH_WANT_INTERRUPTS_ON_CTXSW definition for
> >     ARMv5 and earlier processors. On such processors, the context switch
> >     requires a full cache flush. To avoid high interrupt latencies, this
> >     patch defers the mm switching to the post-lock switch hook if the
> >     interrupts are disabled.
> > 
> >     Reviewed-by: Will Deacon <will.deacon@arm.com>
> >     Tested-by: Will Deacon <will.deacon@arm.com>
> >     Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
> >     Tested-by: Marc Zyngier <Marc.Zyngier@arm.com>
> >     Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > 
> > :040000 040000 034899bdcbc9aa59b5455a85a9d78b646b4cf784 ecc23e33a4ca807d4153f87fbea85a9437ff2928 M      arch
> 
> The problem can be reproduced on several mx28 and an at91sam9263 and
> only occurs of CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop))
> is enabled.
> 
> I have the gut feeling that the "if (irqs_disabled())" check in the
> above patch is not correct for CONFIG_PREEMPT.

The check is there to avoid long interrupt latencies (flushing the whole
cache with interrupts disabled during context switch). You can drop the
check and always call cpu_switch_mm() to confirm that it fixes the
faults.

finish_task_switch() calls finish_arch_post_lock_switch() after the
interrupts have been enabled so that the CPU can actually switch the mm.
I wonder whether we could actually be preempted after
finish_lock_switch() but before we actually switched the MMU.

Here's an untested patch (trying to keep it in the arch/arm code):


diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
index a7b85e0..ded85e9 100644
--- a/arch/arm/include/asm/mmu_context.h
+++ b/arch/arm/include/asm/mmu_context.h
@@ -39,17 +39,20 @@ static inline void check_and_switch_context(struct mm_struct *mm,
 	if (unlikely(mm->context.vmalloc_seq != init_mm.context.vmalloc_seq))
 		__check_vmalloc_seq(mm);
 
-	if (irqs_disabled())
+	if (irqs_disabled()) {
 		/*
 		 * cpu_switch_mm() needs to flush the VIVT caches. To avoid
 		 * high interrupt latencies, defer the call and continue
 		 * running with the old mm. Since we only support UP systems
 		 * on non-ASID CPUs, the old mm will remain valid until the
-		 * finish_arch_post_lock_switch() call.
+		 * finish_arch_post_lock_switch() call. Preemption needs to be
+		 * disabled until the MMU is switched.
 		 */
 		set_ti_thread_flag(task_thread_info(tsk), TIF_SWITCH_MM);
-	else
+		preempt_disable();
+	} else {
 		cpu_switch_mm(mm->pgd, mm);
+	}
 }
 
 #define finish_arch_post_lock_switch \
@@ -59,6 +62,10 @@ static inline void finish_arch_post_lock_switch(void)
 	if (test_and_clear_thread_flag(TIF_SWITCH_MM)) {
 		struct mm_struct *mm = current->mm;
 		cpu_switch_mm(mm->pgd, mm);
+		/*
+		 * Preemption disabled in check_and_switch_context().
+		 */
+		preempt_enable();
 	}
 }
 

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20  9:57 ` Catalin Marinas
@ 2013-06-20 10:08   ` Marc Kleine-Budde
  2013-06-20 10:14   ` Marc Kleine-Budde
  1 sibling, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20 10:08 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 11:57 AM, Catalin Marinas wrote:
> Hi Marc,
> 
> On Thu, Jun 20, 2013 at 09:43:33AM +0100, Marc Kleine-Budde wrote:
>> on current linus/master on armv5 we observed stack trashing[1] on high
>> load. I bisected this problem down to commit:
>>
>>> b9d4d42ad901cc848ac87f1cb8923fded3645568 is the first bad commit
>>> commit b9d4d42ad901cc848ac87f1cb8923fded3645568
>>> Author: Catalin Marinas <catalin.marinas@arm.com>
>>> Date:   Mon Nov 28 21:57:24 2011 +0000
>>>
>>>     ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs
>>>
>>>     This patch removes the __ARCH_WANT_INTERRUPTS_ON_CTXSW definition for
>>>     ARMv5 and earlier processors. On such processors, the context switch
>>>     requires a full cache flush. To avoid high interrupt latencies, this
>>>     patch defers the mm switching to the post-lock switch hook if the
>>>     interrupts are disabled.
>>>
>>>     Reviewed-by: Will Deacon <will.deacon@arm.com>
>>>     Tested-by: Will Deacon <will.deacon@arm.com>
>>>     Reviewed-by: Frank Rowand <frank.rowand@am.sony.com>
>>>     Tested-by: Marc Zyngier <Marc.Zyngier@arm.com>
>>>     Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>>>
>>> :040000 040000 034899bdcbc9aa59b5455a85a9d78b646b4cf784 ecc23e33a4ca807d4153f87fbea85a9437ff2928 M      arch
>>
>> The problem can be reproduced on several mx28 and an at91sam9263 and
>> only occurs of CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop))
>> is enabled.
>>
>> I have the gut feeling that the "if (irqs_disabled())" check in the
>> above patch is not correct for CONFIG_PREEMPT.
> 
> The check is there to avoid long interrupt latencies (flushing the whole
> cache with interrupts disabled during context switch). You can drop the
> check and always call cpu_switch_mm() to confirm that it fixes the
> faults.

I have disabled the check:

diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
index a7b85e0..7b3f67f 100644
--- a/arch/arm/include/asm/mmu_context.h
+++ b/arch/arm/include/asm/mmu_context.h
@@ -39,7 +39,7 @@ static inline void check_and_switch_context(struct mm_struct *mm,
        if (unlikely(mm->context.vmalloc_seq != init_mm.context.vmalloc_seq))
                __check_vmalloc_seq(mm);

-       if (irqs_disabled())
+       if (0 && irqs_disabled())
                /*
                 * cpu_switch_mm() needs to flush the VIVT caches. To avoid
                 * high interrupt latencies, defer the call and continue

...and the test is running without problems, while I'm writing this
email. No more segfaults so far.

> finish_task_switch() calls finish_arch_post_lock_switch() after the
> interrupts have been enabled so that the CPU can actually switch the mm.
> I wonder whether we could actually be preempted after
> finish_lock_switch() but before we actually switched the MMU.
> 
> Here's an untested patch (trying to keep it in the arch/arm code):
> 
> 
> diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
> index a7b85e0..ded85e9 100644
> --- a/arch/arm/include/asm/mmu_context.h
> +++ b/arch/arm/include/asm/mmu_context.h
> @@ -39,17 +39,20 @@ static inline void check_and_switch_context(struct mm_struct *mm,
>  	if (unlikely(mm->context.vmalloc_seq != init_mm.context.vmalloc_seq))
>  		__check_vmalloc_seq(mm);
>  
> -	if (irqs_disabled())
> +	if (irqs_disabled()) {
>  		/*
>  		 * cpu_switch_mm() needs to flush the VIVT caches. To avoid
>  		 * high interrupt latencies, defer the call and continue
>  		 * running with the old mm. Since we only support UP systems
>  		 * on non-ASID CPUs, the old mm will remain valid until the
> -		 * finish_arch_post_lock_switch() call.
> +		 * finish_arch_post_lock_switch() call. Preemption needs to be
> +		 * disabled until the MMU is switched.
>  		 */
>  		set_ti_thread_flag(task_thread_info(tsk), TIF_SWITCH_MM);
> -	else
> +		preempt_disable();
> +	} else {
>  		cpu_switch_mm(mm->pgd, mm);
> +	}
>  }
>  
>  #define finish_arch_post_lock_switch \
> @@ -59,6 +62,10 @@ static inline void finish_arch_post_lock_switch(void)
>  	if (test_and_clear_thread_flag(TIF_SWITCH_MM)) {
>  		struct mm_struct *mm = current->mm;
>  		cpu_switch_mm(mm->pgd, mm);
> +		/*
> +		 * Preemption disabled in check_and_switch_context().
> +		 */
> +		preempt_enable();
>  	}
>  }

Will now reboot to test your patch.

thanks,
Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/510c114b/attachment.sig>

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20  9:57 ` Catalin Marinas
  2013-06-20 10:08   ` Marc Kleine-Budde
@ 2013-06-20 10:14   ` Marc Kleine-Budde
  2013-06-20 10:28     ` Catalin Marinas
  1 sibling, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20 10:14 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 11:57 AM, Catalin Marinas wrote:
>>> :040000 040000 034899bdcbc9aa59b5455a85a9d78b646b4cf784 ecc23e33a4ca807d4153f87fbea85a9437ff2928 M      arch
>>
>> The problem can be reproduced on several mx28 and an at91sam9263 and
>> only occurs of CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop))
>> is enabled.
>>
>> I have the gut feeling that the "if (irqs_disabled())" check in the
>> above patch is not correct for CONFIG_PREEMPT.
> 
> The check is there to avoid long interrupt latencies (flushing the whole
> cache with interrupts disabled during context switch). You can drop the
> check and always call cpu_switch_mm() to confirm that it fixes the
> faults.
> 
> finish_task_switch() calls finish_arch_post_lock_switch() after the
> interrupts have been enabled so that the CPU can actually switch the mm.
> I wonder whether we could actually be preempted after
> finish_lock_switch() but before we actually switched the MMU.
> 
> Here's an untested patch (trying to keep it in the arch/arm code):

[...]

When stating userspace I get this reproducible:

init started: BusyBox v1.18.5 (2013-04-04 18:31:36 CEST)
starting pid 50, tty '/dev/console': '/etc/init.d/rcS'
[    7.000828] ------------[ cut here ]------------
[    7.005560] WARNING: at kernel/sched/core.c:2809 finish_task_switch.constprop.84+0xf8/0x164()
[    7.014650] DEBUG_LOCKS_WARN_ON(val > preempt_count())
[    7.019668] Modules linked in:
[    7.023152] CPU: 0 PID: 51 Comm: rcS Not tainted 3.10.0-rc6-00085-gf36ec4f #20
[    7.030923] [<c0013db4>] (unwind_backtrace+0x0/0xf0) from [<c0011b5c>] (show_stack+0x10/0x14)
[    7.039563] [<c0011b5c>] (show_stack+0x10/0x14) from [<c001bed4>] (warn_slowpath_common+0x4c/0x68)
[    7.048732] [<c001bed4>] (warn_slowpath_common+0x4c/0x68) from [<c001bf84>] (warn_slowpath_fmt+0x30/0x40)
[    7.058505] [<c001bf84>] (warn_slowpath_fmt+0x30/0x40) from [<c0047a3c>] (finish_task_switch.constprop.84+0xf8/0x164)
[    7.069303] [<c0047a3c>] (finish_task_switch.constprop.84+0xf8/0x164) from [<c0047d8c>] (schedule_tail+0xc/0x5c)
[    7.079659] [<c0047d8c>] (schedule_tail+0xc/0x5c) from [<c000ee10>] (ret_from_fork+0x4/0x34)
[    7.088240] ---[ end trace 8c7837ef19036347 ]---
mounting filesystems...[    7.222365] BUG: scheduling while atomic: ksoftirqd/0/3/0x00000002
[    7.229222] INFO: lockdep is turned off.
[    7.233361] Modules linked in:
[    7.236502] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G        W    3.10.0-rc6-00085-gf36ec4f #20
[    7.245509] [<c0013db4>] (unwind_backtrace+0x0/0xf0) from [<c0011b5c>] (show_stack+0x10/0x14)
[    7.254260] [<c0011b5c>] (show_stack+0x10/0x14) from [<c044d908>] (__schedule_bug+0x60/0x78)
[    7.262910] [<c044d908>] (__schedule_bug+0x60/0x78) from [<c0453fec>] (__schedule+0x554/0x624)
[    7.271735] [<c0453fec>] (__schedule+0x554/0x624) from [<c0044cd8>] (smpboot_thread_fn+0x104/0x208)
[    7.280966] [<c0044cd8>] (smpboot_thread_fn+0x104/0x208) from [<c003d5bc>] (kthread+0xa4/0xb0)
[    7.289655] [<c003d5bc>] (kthread+0xa4/0xb0) from [<c000ee20>] (ret_from_fork+0x14/0x34)
[    7.310837] BUG: scheduling while atomic: ksoftirqd/0/3/0x00000002
[    7.317065] INFO: lockdep is turned off.
[    7.321124] Modules linked in:
[    7.324253] CPU: 0 PID: 3 Comm: ksoftirqd/0 Tainted: G        W    3.10.0-rc6-00085-gf36ec4f #20
[    7.333251] [<c0013db4>] (unwind_backtrace+0x0/0xf0) from [<c0011b5c>] (show_stack+0x10/0x14)
[    7.342011] [<c0011b5c>] (show_stack+0x10/0x14) from [<c044d908>] (__schedule_bug+0x60/0x78)
[    7.350539] [<c044d908>] (__schedule_bug+0x60/0x78) from [<c0453fec>] (__schedule+0x554/0x624)
[    7.359342] [<c0453fec>] (__schedule+0x554/0x624) from [<c0044cd8>] (smpboot_thread_fn+0x104/0x208)
[    7.368564] [<c0044cd8>] (smpboot_thread_fn+0x104/0x208) from [<c003d5bc>] (kthread+0xa4/0xb0)
[    7.377364] [<c003d5bc>] (kthread+0xa4/0xb0) from [<c000ee20>] (ret_from_fork+0x14/0x34)
[    7.403225] BUG: scheduling while atomic: ksoftirqd/0/3/0x00000002
[    7.409454] INFO: lockdep is turned off.

..keeps going with "BUG: scheduling while atomic"

Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/ad1b0c54/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 10:14   ` Marc Kleine-Budde
@ 2013-06-20 10:28     ` Catalin Marinas
  2013-06-20 11:12       ` Catalin Marinas
  0 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2013-06-20 10:28 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 20, 2013 at 11:14:07AM +0100, Marc Kleine-Budde wrote:
> On 06/20/2013 11:57 AM, Catalin Marinas wrote:
> >>> :040000 040000 034899bdcbc9aa59b5455a85a9d78b646b4cf784 ecc23e33a4ca807d4153f87fbea85a9437ff2928 M      arch
> >>
> >> The problem can be reproduced on several mx28 and an at91sam9263 and
> >> only occurs of CONFIG_PREEMPT (Preemptible Kernel (Low-Latency Desktop))
> >> is enabled.
> >>
> >> I have the gut feeling that the "if (irqs_disabled())" check in the
> >> above patch is not correct for CONFIG_PREEMPT.
> > 
> > The check is there to avoid long interrupt latencies (flushing the whole
> > cache with interrupts disabled during context switch). You can drop the
> > check and always call cpu_switch_mm() to confirm that it fixes the
> > faults.
> > 
> > finish_task_switch() calls finish_arch_post_lock_switch() after the
> > interrupts have been enabled so that the CPU can actually switch the mm.
> > I wonder whether we could actually be preempted after
> > finish_lock_switch() but before we actually switched the MMU.
> > 
> > Here's an untested patch (trying to keep it in the arch/arm code):
> 
> [...]
> 
> When stating userspace I get this reproducible:
> 
> init started: BusyBox v1.18.5 (2013-04-04 18:31:36 CEST)
> starting pid 50, tty '/dev/console': '/etc/init.d/rcS'
> [    7.000828] ------------[ cut here ]------------
> [    7.005560] WARNING: at kernel/sched/core.c:2809 finish_task_switch.constprop.84+0xf8/0x164()
> [    7.014650] DEBUG_LOCKS_WARN_ON(val > preempt_count())

We may need to place the preempt disable/enable at a higher level in the
scheduler. My theory is that we have a context switch from prev to next.
We get preempted just before finish_arch_post_lock_switch(), so the MMU
hasn't been switched yet. The new switch during preemption happens to a
thread with the same next mm, so the scheduler no longer switch_mm() and
the TIF_SWITCH_MM isn't set for the new thread.

I'll come back with another patch shortly.

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 10:28     ` Catalin Marinas
@ 2013-06-20 11:12       ` Catalin Marinas
  2013-06-20 11:35         ` Marc Kleine-Budde
  0 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2013-06-20 11:12 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 20, 2013 at 11:28:56AM +0100, Catalin Marinas wrote:
> We may need to place the preempt disable/enable at a higher level in the
> scheduler. My theory is that we have a context switch from prev to next.
> We get preempted just before finish_arch_post_lock_switch(), so the MMU
> hasn't been switched yet. The new switch during preemption happens to a
> thread with the same next mm, so the scheduler no longer switch_mm() and
> the TIF_SWITCH_MM isn't set for the new thread.
> 
> I'll come back with another patch shortly.

Here's another attempt (as before, only compile-tested):

--------------------8<-------------------------------

>From 02af0f498a9c4b80c196d52c44ab7700fab0f6db Mon Sep 17 00:00:00 2001
From: Catalin Marinas <catalin.marinas@arm.com>
Date: Thu, 20 Jun 2013 11:59:38 +0100
Subject: [PATCH] arm: Fix deferred mm switch on VIVT processors

As of commit b9d4d42ad9 (ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on
pre-ARMv6 CPUs), the mm switching on VIVT processors is done in the
finish_arch_post_lock_switch() function to avoid whole cache flushing
with interrupts disabled. The need for deferred mm switch is stored as a
thread flag (TIF_SWITCH_MM). However, with preemption enabled, we can
have another thread switch before finish_arch_post_lock_switch(). If the
new thread has the same mm as the previous 'next' thread, the scheduler
will not call switch_mm() and the TIF_SWITCH_MM flag won't be set for
the new thread.

This patch moves the switch pending flag to the mm_context_t structure
since this is specific to the mm rather than thread.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 arch/arm/include/asm/mmu.h         | 2 ++
 arch/arm/include/asm/mmu_context.h | 8 +++++---
 arch/arm/include/asm/thread_info.h | 1 -
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/mmu.h b/arch/arm/include/asm/mmu.h
index e3d5554..d1b4998 100644
--- a/arch/arm/include/asm/mmu.h
+++ b/arch/arm/include/asm/mmu.h
@@ -6,6 +6,8 @@
 typedef struct {
 #ifdef CONFIG_CPU_HAS_ASID
 	atomic64_t	id;
+#else
+	int		switch_pending;
 #endif
 	unsigned int	vmalloc_seq;
 } mm_context_t;
diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
index a7b85e0..4309ae5 100644
--- a/arch/arm/include/asm/mmu_context.h
+++ b/arch/arm/include/asm/mmu_context.h
@@ -47,7 +47,7 @@ static inline void check_and_switch_context(struct mm_struct *mm,
 		 * on non-ASID CPUs, the old mm will remain valid until the
 		 * finish_arch_post_lock_switch() call.
 		 */
-		set_ti_thread_flag(task_thread_info(tsk), TIF_SWITCH_MM);
+		mm->context.switch_pending = 1;
 	else
 		cpu_switch_mm(mm->pgd, mm);
 }
@@ -56,9 +56,11 @@ static inline void check_and_switch_context(struct mm_struct *mm,
 	finish_arch_post_lock_switch
 static inline void finish_arch_post_lock_switch(void)
 {
-	if (test_and_clear_thread_flag(TIF_SWITCH_MM)) {
-		struct mm_struct *mm = current->mm;
+	struct mm_struct *mm = current->mm;
+
+	if (mm->context.switch_pending) {
 		cpu_switch_mm(mm->pgd, mm);
+		mm->context.switch_pending = 0;
 	}
 }
 
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 1995d1a..f00b569 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -156,7 +156,6 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *,
 #define TIF_USING_IWMMXT	17
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
 #define TIF_RESTORE_SIGMASK	20
-#define TIF_SWITCH_MM		22	/* deferred switch_mm */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 11:12       ` Catalin Marinas
@ 2013-06-20 11:35         ` Marc Kleine-Budde
  2013-06-20 11:39           ` Marc Kleine-Budde
  0 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20 11:35 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 01:12 PM, Catalin Marinas wrote:
> On Thu, Jun 20, 2013 at 11:28:56AM +0100, Catalin Marinas wrote:
>> We may need to place the preempt disable/enable at a higher level in the
>> scheduler. My theory is that we have a context switch from prev to next.
>> We get preempted just before finish_arch_post_lock_switch(), so the MMU
>> hasn't been switched yet. The new switch during preemption happens to a
>> thread with the same next mm, so the scheduler no longer switch_mm() and
>> the TIF_SWITCH_MM isn't set for the new thread.
>>
>> I'll come back with another patch shortly.
> 
> Here's another attempt (as before, only compile-tested):

booting kernel from /image
zImage: concatenated oftree detected
booting Linux kernel with devicetree

...dead...

Does every process have a "mm"? Even Kernel threads?

Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/26b07f5f/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 11:35         ` Marc Kleine-Budde
@ 2013-06-20 11:39           ` Marc Kleine-Budde
  2013-06-20 11:47             ` Marc Kleine-Budde
  0 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20 11:39 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 01:35 PM, Marc Kleine-Budde wrote:
> On 06/20/2013 01:12 PM, Catalin Marinas wrote:
>> On Thu, Jun 20, 2013 at 11:28:56AM +0100, Catalin Marinas wrote:
>>> We may need to place the preempt disable/enable at a higher level in the
>>> scheduler. My theory is that we have a context switch from prev to next.
>>> We get preempted just before finish_arch_post_lock_switch(), so the MMU
>>> hasn't been switched yet. The new switch during preemption happens to a
>>> thread with the same next mm, so the scheduler no longer switch_mm() and
>>> the TIF_SWITCH_MM isn't set for the new thread.
>>>
>>> I'll come back with another patch shortly.
>>
>> Here's another attempt (as before, only compile-tested):
> 
> booting kernel from /image
> zImage: concatenated oftree detected
> booting Linux kernel with devicetree
> 
> ...dead...
> 
> Does every process have a "mm"? Even Kernel threads?

early printk gives us:

[    0.153207] CPU: Testing write buffer coherency: ok
[    0.161494] Unable to handle kernel NULL pointer dereference at virtual address 000001a4
[    0.170030] Unable to handle kernel NULL pointer dereference at virtual address 000001a4
[    0.178363] pgd = c0004000
[    0.181377] Unable to handle kernel NULL pointer dereference at virtual address 000001a4
[    0.189713] pgd = c0004000
[    0.192676] [000001a4] *pgd=00000000
[    0.196447] Internal error: Oops: 5 [#1] PREEMPT ARM
[    0.201594] Modules linked in:
[    0.204820] CPU: 0 PID: 2 Comm: swapper Not tainted 3.10.0-rc6-00085-g1c69299 #25
[    0.212524] task: c78408a0 ti: c7848000 task.ti: c7848000
[    0.218142] PC is at finish_task_switch.constprop.89+0x88/0x140
[    0.224279] LR is at _raw_spin_unlock_irq+0x38/0x58
[    0.229342] pc : [<c0047218>]    lr : [<c04554b8>]    psr: 60000053
[    0.229342] sp : c7849c58  ip : 600000d3  fp : c7849c84
[    0.241191] r10: 00000000  r9 : c0650088  r8 : c7840000
[    0.246599] r7 : c065e118  r6 : c7848000  r5 : 00000000  r4 : 00000000
[    0.253328] r3 : c78408a0  r2 : c7849c50  r1 : 00000001  r0 : 40000001
[    0.260059] Flags: nZCv  IRQs on  FIQs off  Mode SVC_32  ISA ARM  Segment kernel
[    0.267671] Control: 0005317f  Table: 40004000  DAC: 00000017
[    0.273607] Process swapper (pid: 2, stack limit = 0xc78481b8)
[    0.279631] Stack: (0xc7849c58 to 0xc784a000)
[    0.284163] 9c40:                                                       00000002 00000000
[    0.292587] 9c60: c00471d4 00000017 c78408a0 c7840000 c0658610 c7848000 c0658610 c065e118
[    0.301011] 9c80: c7849d14 c0453cbc 00000002 c7848000 c06ae110 c06ae310 c7849c98 c7849c98
[    0.309430] 9ca0: 0a217bcd 00000000 00000100 c002368c c78408a0 00000000 00000001 00000008
[    0.317852] 9cc0: c068a5c4 c0023894 c78408a0 c7804c00 00000100 00000004 00000000 c7848000
[    0.326274] 9ce0: c78408a0 c0454458 00000001 c000e9a4 c7848000 c7848000 20000053 ffffffff
[    0.334695] 9d00: c7849d5c c000e9a4 c7848000 00000006 c7849d24 c0454468 c78408a0 c001e2b0
[    0.343115] 9d20: 00000000 c000e9bc 00000001 00000001 00000000 c78408a0 0000004c c0654f08
[    0.351537] 9d40: c068c9e0 00000000 00000001 60000053 00000006 00000000 c78408a0 c7849d70
[    0.359956] 9d60: c005e134 c001e2b0 20000053 ffffffff 00000000 00000000 00000000 00000000
[    0.368378] 9d80: c068d2ca 0000004c 60000053 c0011f18 00000001 00000000 00000000 00000000
[    0.376795] 9da0: c0653df8 000001a4 00000000 00000005 c7849f28 000001a4 00000000 c78408a0
[    0.385216] 9dc0: 00000028 c044d514 c055b878 c7849de4 00000000 c7849de4 00000028 c044d2b0
[    0.393637] 9de0: c055b878 c055b84c 000001a4 c055b84c 00000000 c0016144 c06594f8 00000000
[    0.402053] 9e00: c7848000 c005b9ec 00000001 c7840ba0 00000000 00000000 c06578b4 c066a370
[    0.410474] 9e20: c7848000 00000000 00000001 c7840ba0 00000006 00000000 00086006 00000000
[    0.418895] 9e40: 60000093 c02575f0 c7848000 00000000 c7848000 c06578b4 00000000 00000000
[    0.427317] 9e60: 00000000 00000005 c001629c c06547b0 000001a4 c7849f28 00000000 00000000
[    0.435736] 9e80: c7849f9c c00085c4 c0658620 c066a3b8 c0659538 00000000 00000009 c005ba58
[    0.444156] 9ea0: 00000000 c06539e0 c7926000 c0650088 c06578b4 c0042a28 c7848000 00000000
[    0.452579] 9ec0: 00000001 c7840ba0 00000009 00000000 c78408a0 00000017 c7848000 c06585d0
[    0.460998] 9ee0: c065e0d8 c0042a58 00000000 c0650088 c065e0d8 c000ed14 40000013 00000000
[    0.469422] 9f00: c7848000 00000000 c7848000 c0658620 c0047218 40000053 ffffffff c7849f5c
[    0.477840] 9f20: c06528b8 c000e91c 00000000 00000001 c7849f68 c78408a0 00000000 00000000
[    0.486264] 9f40: c7848000 c065e118 c06528b8 00000000 00000000 c7849f9c 600000d3 c7849f70
[    0.494684] 9f60: c04554b8 c0047218 40000053 ffffffff 00000002 00000000 c00471d4 00000001
[    0.503105] 9f80: 00000000 c003da70 00000000 00000000 00000000 00000000 c7849fac c0047de0
[    0.511525] 9fa0: 00000000 c003da70 00000000 c000ee10 00000000 00000000 00000000 00000000
[    0.519943] 9fc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[    0.528361] 9fe0: 00000000 00000000 00000000 00000000 00000013 00000000 55575557 75557153
[    0.536811] [<c0047218>] (finish_task_switch.constprop.89+0x88/0x140) from [<c0453cbc>] (__schedule+0x204/0x600)
[    0.547275] [<c0453cbc>] (__schedule+0x204/0x600) from [<c0454468>] (preempt_schedule_irq+0x48/0x74)
[    0.556685] [<c0454468>] (preempt_schedule_irq+0x48/0x74) from [<c000e9bc>] (svc_preempt+0x8/0x18)
[    0.565914] [<c000e9bc>] (svc_preempt+0x8/0x18) from [<c001e2b0>] (vprintk_emit+0x190/0x53c)
[    0.574625] [<c001e2b0>] (vprintk_emit+0x190/0x53c) from [<c044d514>] (printk+0x30/0x40)
[    0.582975] [<c044d514>] (printk+0x30/0x40) from [<c044d2b0>] (__do_kernel_fault.part.9+0x38/0x74)
[    0.592208] [<c044d2b0>] (__do_kernel_fault.part.9+0x38/0x74) from [<c0016144>] (do_page_fault+0x2b8/0x388)
[    0.602224] [<c0016144>] (do_page_fault+0x2b8/0x388) from [<c00085c4>] (do_DataAbort+0x34/0x98)
[    0.611180] [<c00085c4>] (do_DataAbort+0x34/0x98) from [<c000e91c>] (__dabt_svc+0x3c/0x60)
[    0.619670] Exception stack(0xc7849f28 to 0xc7849f70)
[    0.624914] 9f20:                   00000000 00000001 c7849f68 c78408a0 00000000 00000000
[    0.633334] 9f40: c7848000 c065e118 c06528b8 00000000 00000000 c7849f9c 600000d3 c7849f70
[    0.641746] 9f60: c04554b8 c0047218 40000053 ffffffff
[    0.647009] [<c000e91c>] (__dabt_svc+0x3c/0x60) from [<c0047218>] (finish_task_switch.constprop.89+0x88/0x140)
[    0.657294] [<c0047218>] (finish_task_switch.constprop.89+0x88/0x140) from [<c0047de0>] (schedule_tail+0xc/0x5c)
[    0.667756] [<c0047de0>] (schedule_tail+0xc/0x5c) from [<c000ee10>] (ret_from_fork+0x4/0x34)
[    0.676442] Code: e59f00b0 eb10389b e596300c e59350e0 (e59531a4)
[    0.682941] ---[ end trace 1b75b31a2719ed1c ]---

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/92b4f37c/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 11:39           ` Marc Kleine-Budde
@ 2013-06-20 11:47             ` Marc Kleine-Budde
  2013-06-20 12:48               ` Marc Kleine-Budde
                                 ` (2 more replies)
  0 siblings, 3 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20 11:47 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 01:39 PM, Marc Kleine-Budde wrote:
> On 06/20/2013 01:35 PM, Marc Kleine-Budde wrote:
>> On 06/20/2013 01:12 PM, Catalin Marinas wrote:
>>> On Thu, Jun 20, 2013 at 11:28:56AM +0100, Catalin Marinas wrote:
>>>> We may need to place the preempt disable/enable at a higher level in the
>>>> scheduler. My theory is that we have a context switch from prev to next.
>>>> We get preempted just before finish_arch_post_lock_switch(), so the MMU
>>>> hasn't been switched yet. The new switch during preemption happens to a
>>>> thread with the same next mm, so the scheduler no longer switch_mm() and
>>>> the TIF_SWITCH_MM isn't set for the new thread.
>>>>
>>>> I'll come back with another patch shortly.
>>>
>>> Here's another attempt (as before, only compile-tested):
>>
>> booting kernel from /image
>> zImage: concatenated oftree detected
>> booting Linux kernel with devicetree
>>
>> ...dead...
>>
>> Does every process have a "mm"? Even Kernel threads?

I've added a check for "mm". Boots now and my test runs stable for 3
minutes now.

I'm not sure if we have to check for "mm" in
check_and_switch_context(), too.

Thanks,
Marc
---

>From 306d84d5f0645a86e86d539be0546c4ac758d3d4 Mon Sep 17 00:00:00 2001
From: Catalin Marinas <catalin.marinas@arm.com>
Date: Thu, 20 Jun 2013 12:12:55 +0100
Subject: [PATCH] arm: Fix deferred mm switch on VIVT processors

As of commit b9d4d42ad9 (ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on
pre-ARMv6 CPUs), the mm switching on VIVT processors is done in the
finish_arch_post_lock_switch() function to avoid whole cache flushing
with interrupts disabled. The need for deferred mm switch is stored as a
thread flag (TIF_SWITCH_MM). However, with preemption enabled, we can
have another thread switch before finish_arch_post_lock_switch(). If the
new thread has the same mm as the previous 'next' thread, the scheduler
will not call switch_mm() and the TIF_SWITCH_MM flag won't be set for
the new thread.

This patch moves the switch pending flag to the mm_context_t structure
since this is specific to the mm rather than thread.

Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
[mkl: add check for mm]
Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
---
 arch/arm/include/asm/mmu.h         | 2 ++
 arch/arm/include/asm/mmu_context.h | 8 +++++---
 arch/arm/include/asm/thread_info.h | 1 -
 3 files changed, 7 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/mmu.h b/arch/arm/include/asm/mmu.h
index e3d5554..d1b4998 100644
--- a/arch/arm/include/asm/mmu.h
+++ b/arch/arm/include/asm/mmu.h
@@ -6,6 +6,8 @@
 typedef struct {
 #ifdef CONFIG_CPU_HAS_ASID
 	atomic64_t	id;
+#else
+	int		switch_pending;
 #endif
 	unsigned int	vmalloc_seq;
 } mm_context_t;
diff --git a/arch/arm/include/asm/mmu_context.h b/arch/arm/include/asm/mmu_context.h
index a7b85e0..9503a7b 100644
--- a/arch/arm/include/asm/mmu_context.h
+++ b/arch/arm/include/asm/mmu_context.h
@@ -47,7 +47,7 @@ static inline void check_and_switch_context(struct mm_struct *mm,
 		 * on non-ASID CPUs, the old mm will remain valid until the
 		 * finish_arch_post_lock_switch() call.
 		 */
-		set_ti_thread_flag(task_thread_info(tsk), TIF_SWITCH_MM);
+		mm->context.switch_pending = 1;
 	else
 		cpu_switch_mm(mm->pgd, mm);
 }
@@ -56,9 +56,11 @@ static inline void check_and_switch_context(struct mm_struct *mm,
 	finish_arch_post_lock_switch
 static inline void finish_arch_post_lock_switch(void)
 {
-	if (test_and_clear_thread_flag(TIF_SWITCH_MM)) {
-		struct mm_struct *mm = current->mm;
+	struct mm_struct *mm = current->mm;
+
+	if (mm && mm->context.switch_pending) {
 		cpu_switch_mm(mm->pgd, mm);
+		mm->context.switch_pending = 0;
 	}
 }
 
diff --git a/arch/arm/include/asm/thread_info.h b/arch/arm/include/asm/thread_info.h
index 1995d1a..f00b569 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -156,7 +156,6 @@ extern int vfp_restore_user_hwstate(struct user_vfp __user *,
 #define TIF_USING_IWMMXT	17
 #define TIF_MEMDIE		18	/* is terminating due to OOM killer */
 #define TIF_RESTORE_SIGMASK	20
-#define TIF_SWITCH_MM		22	/* deferred switch_mm */
 
 #define _TIF_SIGPENDING		(1 << TIF_SIGPENDING)
 #define _TIF_NEED_RESCHED	(1 << TIF_NEED_RESCHED)

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/d76998d0/attachment-0001.sig>

^ permalink raw reply related	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 11:47             ` Marc Kleine-Budde
@ 2013-06-20 12:48               ` Marc Kleine-Budde
  2013-06-20 13:01               ` Catalin Marinas
  2013-06-21 10:28               ` Marc Kleine-Budde
  2 siblings, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20 12:48 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 01:47 PM, Marc Kleine-Budde wrote:
> On 06/20/2013 01:39 PM, Marc Kleine-Budde wrote:
>> On 06/20/2013 01:35 PM, Marc Kleine-Budde wrote:
>>> On 06/20/2013 01:12 PM, Catalin Marinas wrote:
>>>> On Thu, Jun 20, 2013 at 11:28:56AM +0100, Catalin Marinas wrote:
>>>>> We may need to place the preempt disable/enable at a higher level in the
>>>>> scheduler. My theory is that we have a context switch from prev to next.
>>>>> We get preempted just before finish_arch_post_lock_switch(), so the MMU
>>>>> hasn't been switched yet. The new switch during preemption happens to a
>>>>> thread with the same next mm, so the scheduler no longer switch_mm() and
>>>>> the TIF_SWITCH_MM isn't set for the new thread.
>>>>>
>>>>> I'll come back with another patch shortly.
>>>>
>>>> Here's another attempt (as before, only compile-tested):
>>>
>>> booting kernel from /image
>>> zImage: concatenated oftree detected
>>> booting Linux kernel with devicetree
>>>
>>> ...dead...
>>>
>>> Does every process have a "mm"? Even Kernel threads?
> 
> I've added a check for "mm". Boots now and my test runs stable for 3
> minutes now.
> 
> I'm not sure if we have to check for "mm" in
> check_and_switch_context(), too.
> 
> Thanks,
> Marc
> ---
> 
> From 306d84d5f0645a86e86d539be0546c4ac758d3d4 Mon Sep 17 00:00:00 2001
> From: Catalin Marinas <catalin.marinas@arm.com>
> Date: Thu, 20 Jun 2013 12:12:55 +0100
> Subject: [PATCH] arm: Fix deferred mm switch on VIVT processors
> 
> As of commit b9d4d42ad9 (ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on
> pre-ARMv6 CPUs), the mm switching on VIVT processors is done in the
> finish_arch_post_lock_switch() function to avoid whole cache flushing
> with interrupts disabled. The need for deferred mm switch is stored as a
> thread flag (TIF_SWITCH_MM). However, with preemption enabled, we can
> have another thread switch before finish_arch_post_lock_switch(). If the
> new thread has the same mm as the previous 'next' thread, the scheduler
> will not call switch_mm() and the TIF_SWITCH_MM flag won't be set for
> the new thread.
> 
> This patch moves the switch pending flag to the mm_context_t structure
> since this is specific to the mm rather than thread.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
> [mkl: add check for mm]
> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

The test program works for 1 hour without problems. I'm going to push
the patch to our customer, if it fixes the problem there,  I'll add my
Tested-by.

regards,
Marc
-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/8f6986d1/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 11:47             ` Marc Kleine-Budde
  2013-06-20 12:48               ` Marc Kleine-Budde
@ 2013-06-20 13:01               ` Catalin Marinas
  2013-06-20 13:05                 ` Marc Kleine-Budde
  2013-06-21 10:28               ` Marc Kleine-Budde
  2 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2013-06-20 13:01 UTC (permalink / raw)
  To: linux-arm-kernel

On Thu, Jun 20, 2013 at 12:47:25PM +0100, Marc Kleine-Budde wrote:
> On 06/20/2013 01:39 PM, Marc Kleine-Budde wrote:
> > On 06/20/2013 01:35 PM, Marc Kleine-Budde wrote:
> >> On 06/20/2013 01:12 PM, Catalin Marinas wrote:
> >>> On Thu, Jun 20, 2013 at 11:28:56AM +0100, Catalin Marinas wrote:
> >>>> We may need to place the preempt disable/enable at a higher level in the
> >>>> scheduler. My theory is that we have a context switch from prev to next.
> >>>> We get preempted just before finish_arch_post_lock_switch(), so the MMU
> >>>> hasn't been switched yet. The new switch during preemption happens to a
> >>>> thread with the same next mm, so the scheduler no longer switch_mm() and
> >>>> the TIF_SWITCH_MM isn't set for the new thread.
> >>>>
> >>>> I'll come back with another patch shortly.
> >>>
> >>> Here's another attempt (as before, only compile-tested):
> >>
> >> booting kernel from /image
> >> zImage: concatenated oftree detected
> >> booting Linux kernel with devicetree
> >>
> >> ...dead...
> >>
> >> Does every process have a "mm"? Even Kernel threads?
> 
> I've added a check for "mm". Boots now and my test runs stable for 3
> minutes now.

Ah, good point.

> I'm not sure if we have to check for "mm" in
> check_and_switch_context(), too.

switch_mm() wouldn't be called with a NULL mm, hence we wouldn't call
check_and_switch_context() either. finish_arch_post_lock_switch() is
called all the time (and we set the TIF flag only if switch_mm() was
called).

Thanks.

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 13:01               ` Catalin Marinas
@ 2013-06-20 13:05                 ` Marc Kleine-Budde
  0 siblings, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-20 13:05 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 03:01 PM, Catalin Marinas wrote:
> On Thu, Jun 20, 2013 at 12:47:25PM +0100, Marc Kleine-Budde wrote:
>> On 06/20/2013 01:39 PM, Marc Kleine-Budde wrote:
>>> On 06/20/2013 01:35 PM, Marc Kleine-Budde wrote:
>>>> On 06/20/2013 01:12 PM, Catalin Marinas wrote:
>>>>> On Thu, Jun 20, 2013 at 11:28:56AM +0100, Catalin Marinas wrote:
>>>>>> We may need to place the preempt disable/enable at a higher level in the
>>>>>> scheduler. My theory is that we have a context switch from prev to next.
>>>>>> We get preempted just before finish_arch_post_lock_switch(), so the MMU
>>>>>> hasn't been switched yet. The new switch during preemption happens to a
>>>>>> thread with the same next mm, so the scheduler no longer switch_mm() and
>>>>>> the TIF_SWITCH_MM isn't set for the new thread.
>>>>>>
>>>>>> I'll come back with another patch shortly.
>>>>>
>>>>> Here's another attempt (as before, only compile-tested):
>>>>
>>>> booting kernel from /image
>>>> zImage: concatenated oftree detected
>>>> booting Linux kernel with devicetree
>>>>
>>>> ...dead...
>>>>
>>>> Does every process have a "mm"? Even Kernel threads?
>>
>> I've added a check for "mm". Boots now and my test runs stable for 3
>> minutes now.
> 
> Ah, good point.
> 
>> I'm not sure if we have to check for "mm" in
>> check_and_switch_context(), too.
> 
> switch_mm() wouldn't be called with a NULL mm, hence we wouldn't call
> check_and_switch_context() either. finish_arch_post_lock_switch() is
> called all the time (and we set the TIF flag only if switch_mm() was
> called).

Thanks for the explanation. $CUSTOMER's mx28 runs stable with the
minimal test program, I'll keep you informed if they have tested with
the full blown program.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130620/0ddc9ae4/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-20 11:47             ` Marc Kleine-Budde
  2013-06-20 12:48               ` Marc Kleine-Budde
  2013-06-20 13:01               ` Catalin Marinas
@ 2013-06-21 10:28               ` Marc Kleine-Budde
  2013-06-21 13:52                 ` Catalin Marinas
  2 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-06-21 10:28 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/20/2013 01:47 PM, Marc Kleine-Budde wrote:
> On 06/20/2013 01:39 PM, Marc Kleine-Budde wrote:
>> On 06/20/2013 01:35 PM, Marc Kleine-Budde wrote:
>>> On 06/20/2013 01:12 PM, Catalin Marinas wrote:
>>>> On Thu, Jun 20, 2013 at 11:28:56AM +0100, Catalin Marinas wrote:
>>>>> We may need to place the preempt disable/enable at a higher level in the
>>>>> scheduler. My theory is that we have a context switch from prev to next.
>>>>> We get preempted just before finish_arch_post_lock_switch(), so the MMU
>>>>> hasn't been switched yet. The new switch during preemption happens to a
>>>>> thread with the same next mm, so the scheduler no longer switch_mm() and
>>>>> the TIF_SWITCH_MM isn't set for the new thread.
>>>>>
>>>>> I'll come back with another patch shortly.
>>>>
>>>> Here's another attempt (as before, only compile-tested):
>>>
>>> booting kernel from /image
>>> zImage: concatenated oftree detected
>>> booting Linux kernel with devicetree
>>>
>>> ...dead...
>>>
>>> Does every process have a "mm"? Even Kernel threads?
> 
> I've added a check for "mm". Boots now and my test runs stable for 3
> minutes now.
> 
> I'm not sure if we have to check for "mm" in
> check_and_switch_context(), too.
> 
> Thanks,
> Marc
> ---
> 
> From 306d84d5f0645a86e86d539be0546c4ac758d3d4 Mon Sep 17 00:00:00 2001
> From: Catalin Marinas <catalin.marinas@arm.com>
> Date: Thu, 20 Jun 2013 12:12:55 +0100
> Subject: [PATCH] arm: Fix deferred mm switch on VIVT processors
> 
> As of commit b9d4d42ad9 (ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on
> pre-ARMv6 CPUs), the mm switching on VIVT processors is done in the
> finish_arch_post_lock_switch() function to avoid whole cache flushing
> with interrupts disabled. The need for deferred mm switch is stored as a
> thread flag (TIF_SWITCH_MM). However, with preemption enabled, we can
> have another thread switch before finish_arch_post_lock_switch(). If the
> new thread has the same mm as the previous 'next' thread, the scheduler
> will not call switch_mm() and the TIF_SWITCH_MM flag won't be set for
> the new thread.
> 
> This patch moves the switch pending flag to the mm_context_t structure
> since this is specific to the mm rather than thread.
> 
> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
> [mkl: add check for mm]
> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>

Works on $CUSTOMERS's hardware.

Tested-by: Marc Kleine-Budde <mkl@pengutronix.de>

Catalin, do you take care of the patch? Please add stable on Cc. The fix
is needed on kernels >= v3.5.

regards,
Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130621/1f4e3652/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-21 10:28               ` Marc Kleine-Budde
@ 2013-06-21 13:52                 ` Catalin Marinas
  2013-07-17  8:41                   ` Marc Kleine-Budde
  0 siblings, 1 reply; 20+ messages in thread
From: Catalin Marinas @ 2013-06-21 13:52 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Jun 21, 2013 at 11:28:48AM +0100, Marc Kleine-Budde wrote:
> On 06/20/2013 01:47 PM, Marc Kleine-Budde wrote:
> > From 306d84d5f0645a86e86d539be0546c4ac758d3d4 Mon Sep 17 00:00:00 2001
> > From: Catalin Marinas <catalin.marinas@arm.com>
> > Date: Thu, 20 Jun 2013 12:12:55 +0100
> > Subject: [PATCH] arm: Fix deferred mm switch on VIVT processors
> > 
> > As of commit b9d4d42ad9 (ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on
> > pre-ARMv6 CPUs), the mm switching on VIVT processors is done in the
> > finish_arch_post_lock_switch() function to avoid whole cache flushing
> > with interrupts disabled. The need for deferred mm switch is stored as a
> > thread flag (TIF_SWITCH_MM). However, with preemption enabled, we can
> > have another thread switch before finish_arch_post_lock_switch(). If the
> > new thread has the same mm as the previous 'next' thread, the scheduler
> > will not call switch_mm() and the TIF_SWITCH_MM flag won't be set for
> > the new thread.
> > 
> > This patch moves the switch pending flag to the mm_context_t structure
> > since this is specific to the mm rather than thread.
> > 
> > Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
> > Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
> > [mkl: add check for mm]
> > Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
> 
> Works on $CUSTOMERS's hardware.
> 
> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de>

Thanks.

> Catalin, do you take care of the patch? Please add stable on Cc. The fix
> is needed on kernels >= v3.5.

I'll send it to Russell, if possible it should go in for 3.10. I already
added CC stable in my copy (3.5+).

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-06-21 13:52                 ` Catalin Marinas
@ 2013-07-17  8:41                   ` Marc Kleine-Budde
  2013-07-17  8:51                     ` Russell King - ARM Linux
  0 siblings, 1 reply; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-07-17  8:41 UTC (permalink / raw)
  To: linux-arm-kernel

On 06/21/2013 03:52 PM, Catalin Marinas wrote:
> On Fri, Jun 21, 2013 at 11:28:48AM +0100, Marc Kleine-Budde wrote:
>> On 06/20/2013 01:47 PM, Marc Kleine-Budde wrote:
>>> From 306d84d5f0645a86e86d539be0546c4ac758d3d4 Mon Sep 17 00:00:00 2001
>>> From: Catalin Marinas <catalin.marinas@arm.com>
>>> Date: Thu, 20 Jun 2013 12:12:55 +0100
>>> Subject: [PATCH] arm: Fix deferred mm switch on VIVT processors
>>>
>>> As of commit b9d4d42ad9 (ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on
>>> pre-ARMv6 CPUs), the mm switching on VIVT processors is done in the
>>> finish_arch_post_lock_switch() function to avoid whole cache flushing
>>> with interrupts disabled. The need for deferred mm switch is stored as a
>>> thread flag (TIF_SWITCH_MM). However, with preemption enabled, we can
>>> have another thread switch before finish_arch_post_lock_switch(). If the
>>> new thread has the same mm as the previous 'next' thread, the scheduler
>>> will not call switch_mm() and the TIF_SWITCH_MM flag won't be set for
>>> the new thread.
>>>
>>> This patch moves the switch pending flag to the mm_context_t structure
>>> since this is specific to the mm rather than thread.
>>>
>>> Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
>>> Reported-by: Marc Kleine-Budde <mkl@pengutronix.de>
>>> [mkl: add check for mm]
>>> Signed-off-by: Marc Kleine-Budde <mkl@pengutronix.de>
>>
>> Works on $CUSTOMERS's hardware.
>>
>> Tested-by: Marc Kleine-Budde <mkl@pengutronix.de>
> 
> Thanks.
> 
>> Catalin, do you take care of the patch? Please add stable on Cc. The fix
>> is needed on kernels >= v3.5.
> 
> I'll send it to Russell, if possible it should go in for 3.10. I already
> added CC stable in my copy (3.5+).

What happend to that patch? I cannot find it in v3.10.x nor in linus/master.

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130717/dcf37d41/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-07-17  8:41                   ` Marc Kleine-Budde
@ 2013-07-17  8:51                     ` Russell King - ARM Linux
  2013-07-17 11:48                       ` Marc Kleine-Budde
  2013-07-17 19:41                       ` Catalin Marinas
  0 siblings, 2 replies; 20+ messages in thread
From: Russell King - ARM Linux @ 2013-07-17  8:51 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 17, 2013 at 10:41:19AM +0200, Marc Kleine-Budde wrote:
> What happend to that patch? I cannot find it in v3.10.x nor in linus/master.

Catalin sent it to me, and I didn't apply it because it's still racy.

We have stateful context switches (due to cache clearing) and this does
nothing to solve the problem there.  If we get preempted during the
cache clearing, things will still go wrong.

See:

http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=7771/1

I've no idea what's happening about this.  It looks to me like nothing
happened after Catalin's last comment there.

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-07-17  8:51                     ` Russell King - ARM Linux
@ 2013-07-17 11:48                       ` Marc Kleine-Budde
  2013-07-17 19:41                       ` Catalin Marinas
  1 sibling, 0 replies; 20+ messages in thread
From: Marc Kleine-Budde @ 2013-07-17 11:48 UTC (permalink / raw)
  To: linux-arm-kernel

On 07/17/2013 10:51 AM, Russell King - ARM Linux wrote:
> On Wed, Jul 17, 2013 at 10:41:19AM +0200, Marc Kleine-Budde wrote:
>> What happend to that patch? I cannot find it in v3.10.x nor in linus/master.
> 
> Catalin sent it to me, and I didn't apply it because it's still racy.
> 
> We have stateful context switches (due to cache clearing) and this does
> nothing to solve the problem there.  If we get preempted during the
> cache clearing, things will still go wrong.

Thanks for the update.

> See:
> 
> http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=7771/1
> 
> I've no idea what's happening about this.  It looks to me like nothing
> happened after Catalin's last comment there.

Catalin, what's next?

Marc

-- 
Pengutronix e.K.                  | Marc Kleine-Budde           |
Industrial Linux Solutions        | Phone: +49-231-2826-924     |
Vertretung West/Dortmund          | Fax:   +49-5121-206917-5555 |
Amtsgericht Hildesheim, HRA 2686  | http://www.pengutronix.de   |

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 259 bytes
Desc: OpenPGP digital signature
URL: <http://lists.infradead.org/pipermail/linux-arm-kernel/attachments/20130717/e01fe949/attachment.sig>

^ permalink raw reply	[flat|nested] 20+ messages in thread

* BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT
  2013-07-17  8:51                     ` Russell King - ARM Linux
  2013-07-17 11:48                       ` Marc Kleine-Budde
@ 2013-07-17 19:41                       ` Catalin Marinas
  1 sibling, 0 replies; 20+ messages in thread
From: Catalin Marinas @ 2013-07-17 19:41 UTC (permalink / raw)
  To: linux-arm-kernel

On Wed, Jul 17, 2013 at 09:51:39AM +0100, Russell King - ARM Linux wrote:
> On Wed, Jul 17, 2013 at 10:41:19AM +0200, Marc Kleine-Budde wrote:
> > What happend to that patch? I cannot find it in v3.10.x nor in linus/master.
> 
> Catalin sent it to me, and I didn't apply it because it's still racy.
> 
> We have stateful context switches (due to cache clearing) and this does
> nothing to solve the problem there.  If we get preempted during the
> cache clearing, things will still go wrong.
> 
> See:
> 
> http://www.arm.linux.org.uk/developer/patches/viewpatch.php?id=7771/1
> 
> I've no idea what's happening about this.  It looks to me like nothing
> happened after Catalin's last comment there.

Too many other things to do. It's on my list for this week or early next
week.

-- 
Catalin

^ permalink raw reply	[flat|nested] 20+ messages in thread

end of thread, other threads:[~2013-07-17 19:41 UTC | newest]

Thread overview: 20+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2013-06-20  8:43 BUG: commit "ARM: Remove __ARCH_WANT_INTERRUPTS_ON_CTXSW on pre-ARMv6 CPUs" breaks armv5 with CONFIG_PREEMPT Marc Kleine-Budde
2013-06-20  9:25 ` Will Deacon
2013-06-20  9:51   ` Marc Kleine-Budde
2013-06-20  9:57 ` Catalin Marinas
2013-06-20 10:08   ` Marc Kleine-Budde
2013-06-20 10:14   ` Marc Kleine-Budde
2013-06-20 10:28     ` Catalin Marinas
2013-06-20 11:12       ` Catalin Marinas
2013-06-20 11:35         ` Marc Kleine-Budde
2013-06-20 11:39           ` Marc Kleine-Budde
2013-06-20 11:47             ` Marc Kleine-Budde
2013-06-20 12:48               ` Marc Kleine-Budde
2013-06-20 13:01               ` Catalin Marinas
2013-06-20 13:05                 ` Marc Kleine-Budde
2013-06-21 10:28               ` Marc Kleine-Budde
2013-06-21 13:52                 ` Catalin Marinas
2013-07-17  8:41                   ` Marc Kleine-Budde
2013-07-17  8:51                     ` Russell King - ARM Linux
2013-07-17 11:48                       ` Marc Kleine-Budde
2013-07-17 19:41                       ` Catalin Marinas

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.