From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S965288AbcAUNXN (ORCPT ); Thu, 21 Jan 2016 08:23:13 -0500
Received: from mail-io0-f196.google.com ([209.85.223.196]:33104
        "EHLO mail-io0-f196.google.com" rhost-flags-OK-OK-OK-OK)
        by vger.kernel.org with ESMTP id S965107AbcAUNW6 (ORCPT );
        Thu, 21 Jan 2016 08:22:58 -0500
MIME-Version: 1.0
Date: Thu, 21 Jan 2016 14:22:56 +0100
X-Google-Sender-Auth: dTb4XonvwnHXLEcKPgFOoru718Y
Message-ID:
Subject: RCU lockup? (was: Re: [PATCH v2 tip/core/rcu 10/14] rcu: Don't
        redundantly disable irqs in rcu_irq_{enter,exit}())
From: Geert Uytterhoeven
To: "Paul E. McKenney"
Cc: "linux-kernel@vger.kernel.org", Ingo Molnar, jiangshanlai@gmail.com,
        dipankar@in.ibm.com, Andrew Morton, Mathieu Desnoyers,
        Josh Triplett, Thomas Gleixner, Peter Zijlstra, Steven Rostedt,
        David Howells, Eric Dumazet, Darren Hart,
        =?UTF-8?B?RnLDqWTDqXJpYyBXZWlzYmVja2Vy?=, Oleg Nesterov,
        pranith kumar, "linux-arm-kernel@lists.infradead.org",
        linux-renesas-soc@vger.kernel.org
Content-Type: text/plain; charset=UTF-8
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Paul,

On Thu, Dec 10, 2015 at 12:10 AM, Paul E. McKenney wrote:
> This commit replaces a local_irq_save()/local_irq_restore() pair with
> a lockdep assertion that interrupts are already disabled. This should
> remove the corresponding overhead from the interrupt entry/exit fastpaths.
>
> This change was inspired by the fact that Iftekhar Ahmed's mutation
> testing showed that removing rcu_irq_enter()'s call to local_ird_restore()
> had no effect, which might indicate that interrupts were always enabled
> anyway.
>
> Signed-off-by: Paul E. McKenney
> ---
>  include/linux/rcupdate.h   |  4 ++--
>  include/linux/rcutiny.h    |  8 ++++++++
>  include/linux/rcutree.h    |  2 ++
>  include/linux/tracepoint.h |  4 ++--
>  kernel/rcu/tree.c          | 32 ++++++++++++++++++++++++++------
>  5 files changed, 40 insertions(+), 10 deletions(-)

This commit (7c9906ca5e582a773fff696975e312cef58a7386) is triggering lock ups
during boot on r8a7791/koelsch (dual Cortex A15).

Probably this commit does not contain the real bug, but a symptom.
Unfortunately I cannot reproduce it with CONFIG_PROVE_RCU=y.

I started seeing the issue when disabling an innocent option in
shmobile_defconfig. I tracked it down to the removal of an unused C function,
containing hardware support for another system.

Replacing the C function by a dummy function with the right number of
"asm("nop")"s (depending on kernel version and/or kernel config, sigh) made
the issue go away. Adding or removing nops makes the issue reappear, and has
some impact on how early the issue happens (sometimes as late as early
userspace). Adding a multiple of 16 nops has no impact.

So it looks like something that should be cacheline-aligned isn't...

CONFIG_TREE_RCU=y

Do you have a suggestion?

Thanks!

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
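
The "multiple of 16 nops" observation in the report above can be checked with a little arithmetic. The following is only a sketch under assumptions that the report does not state: that the kernel is built in ARM (A32) mode, so each nop occupies 4 bytes, and that the relevant cache line is 64 bytes. The names dummy_padding(), shift_within_cacheline() and self_check() are hypothetical and do not come from the kernel tree or from the report.

```c
#include <assert.h>

/* Assumptions (not stated in the report): 4-byte A32 nop, 64-byte cache line. */
#define NOP_BYTES        4u
#define CACHELINE_BYTES 64u

/*
 * Hypothetical stand-in for the dummy padding function described in the
 * report: it does nothing except occupy code space, so adding or removing
 * nops moves everything the linker places after it.
 */
void dummy_padding(void)
{
    __asm__ volatile("nop");
    __asm__ volatile("nop");
}

/*
 * How far, modulo the cache line size, later code and data move when
 * `nops` extra nop instructions are added to the padding function.
 */
unsigned int shift_within_cacheline(unsigned int nops)
{
    return (nops * NOP_BYTES) % CACHELINE_BYTES;
}

/* Under the assumptions above, 16 nops span exactly one cache line. */
void self_check(void)
{
    assert(shift_within_cacheline(16) == 0);
    assert(shift_within_cacheline(32) == 0);
    assert(shift_within_cacheline(1) == 4);
}
```

If those assumptions hold, 16 nops add exactly 64 bytes, so the offset of everything that follows is unchanged relative to cache line boundaries, which would be consistent with the cacheline-alignment suspicion. With Thumb-2 encodings or a different line size the numbers change, so this illustrates the reasoning rather than confirming the diagnosis.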