From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932458Ab3B0R7z (ORCPT ); Wed, 27 Feb 2013 12:59:55 -0500 Received: from e7.ny.us.ibm.com ([32.97.182.137]:52355 "EHLO e7.ny.us.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1760238Ab3B0R7x (ORCPT ); Wed, 27 Feb 2013 12:59:53 -0500 Date: Wed, 27 Feb 2013 09:59:45 -0800 From: "Paul E. McKenney" To: Jan Kara Cc: Andrew Morton , LKML Subject: Re: [PATCH v3] printk: Avoid softlockups in console_unlock() Message-ID: <20130227175945.GC3336@linux.vnet.ibm.com> Reply-To: paulmck@linux.vnet.ibm.com References: <1361927819-1570-1-git-send-email-jack@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <1361927819-1570-1-git-send-email-jack@suse.cz> User-Agent: Mutt/1.5.21 (2010-09-15) X-Content-Scanned: Fidelis XPS MAILER x-cbid: 13022717-5806-0000-0000-000020204783 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Feb 27, 2013 at 02:16:59AM +0100, Jan Kara wrote: > A CPU can be caught in console_unlock() for a long time (tens of seconds > are reported by our customers) when other CPUs are using printk heavily > and serial console makes printing slow. Despite serial console drivers > are calling touch_nmi_watchdog() this triggers softlockup warnings > because interrupts are effectively disabled for the whole time printing > takes place. Thus IPIs cannot be processed and other CPUs get stuck > spinning in calls like smp_call_function_many(). Also RCU eventually > starts reporting lockups. > > In my artifical testing I also managed to trigger a situation when disk > disappeared from the system apparently because commands to / from it > could not be delivered for long enough. This is why just silencing > watchdogs isn't a reliable solution to the problem. > > One part of fixing the issue is changing vprintk_emit() to call > console_unlock() with interrupts enabled (this isn't perfect as printk() > itself can be called with interrupts disabled but it improves the > situation in lots of cases). Another part is limiting the time we spend > in console_unlock() printing loop to watchdog_thresh() / 4. Then we > release console_sem and wait for watchdog_thresh() / 4 to give a chance > to other printk() users to get the semaphore and start printing. If > printk() was called with interrupts enabled, it also gives CPU a chance > to process blocked interrupts. Then we recheck if there's still anything > to print, try to grab console_sem again and if we succeed, we go on with > printing. Hello, Jan, If a CPU remains in your loop where you invoke rcu_cpu_stall_reset(), there is no indication of the stall. Now, I do understand why having RCU add a CPU stall warning into the mix would be very bad, but it would be good to have some sort of indication that there is stalling. Perhaps a counter visible in sysfs? Whatever it is, at least some way of diagnosing the stall condition would be very good. Thanx, Paul > Signed-off-by: Jan Kara > --- > include/linux/nmi.h | 18 ++++++++++++++++++ > kernel/printk.c | 44 ++++++++++++++++++++++++++++++++++++++------ > 2 files changed, 56 insertions(+), 6 deletions(-) > > Andrew, I was playing with this problem some more and this patch is the least > scary fix I found ;). It may slow down printing in heavy-printk cases though. > What do you think? > > diff --git a/include/linux/nmi.h b/include/linux/nmi.h > index db50840..ce32959 100644 > --- a/include/linux/nmi.h > +++ b/include/linux/nmi.h > @@ -51,6 +51,24 @@ extern int watchdog_thresh; > struct ctl_table; > extern int proc_dowatchdog(struct ctl_table *, int , > void __user *, size_t *, loff_t *); > +/* > + * Return the maximum number of nanosecond for which interrupts may be disabled > + * on the current CPU > + */ > +static inline u64 max_interrupt_disabled_duration(void) > +{ > + /* > + * We give us some headroom because timers need time to fire before the > + * watchdog period expires. > + */ > + return ((u64)watchdog_thresh) * NSEC_PER_SEC / 2; > +} > +#else > +static inline u64 max_interrupt_disabled_duration(void) > +{ > + /* About the value we'd get with the default watchdog setting */ > + return 5ULL * NSEC_PER_SEC; > +} > #endif > > #endif > diff --git a/kernel/printk.c b/kernel/printk.c > index 0b31715..2f4d9a4 100644 > --- a/kernel/printk.c > +++ b/kernel/printk.c > @@ -1522,7 +1522,8 @@ asmlinkage int vprintk_emit(int facility, int level, > */ > if (!oops_in_progress && !lockdep_recursing(current)) { > recursion_bug = 1; > - goto out_restore_irqs; > + local_irq_restore(flags); > + return printed_len; > } > zap_locks(); > } > @@ -1616,17 +1617,19 @@ asmlinkage int vprintk_emit(int facility, int level, > /* > * Try to acquire and then immediately release the console semaphore. > * The release will print out buffers and wake up /dev/kmsg and syslog() > - * users. > + * users. We call console_unlock() with interrupts enabled if possible > + * since printing can take a long time. > * > * The console_trylock_for_printk() function will release 'logbuf_lock' > * regardless of whether it actually gets the console semaphore or not. > */ > - if (console_trylock_for_printk(this_cpu)) > + if (console_trylock_for_printk(this_cpu)) { > + local_irq_restore(flags); > console_unlock(); > + } else > + local_irq_restore(flags); > > lockdep_on(); > -out_restore_irqs: > - local_irq_restore(flags); > > return printed_len; > } > @@ -2046,6 +2049,8 @@ void console_unlock(void) > unsigned long flags; > bool wake_klogd = false; > bool retry; > + u64 end_time, now; > + int cur_cpu; > > if (console_suspended) { > up(&console_sem); > @@ -2053,6 +2058,15 @@ void console_unlock(void) > } > > console_may_schedule = 0; > + preempt_disable(); > + cur_cpu = smp_processor_id(); > + /* > + * We give us some headroom because we check the time only after > + * printing the whole message > + */ > + end_time = cpu_clock(cur_cpu) + max_interrupt_disabled_duration() / 2; > + preempt_enable(); > + > > /* flush buffered message fragment immediately to console */ > console_cont_flush(text, sizeof(text)); > @@ -2075,7 +2089,8 @@ again: > console_prev = 0; > } > skip: > - if (console_seq == log_next_seq) > + now = sched_clock_cpu(cur_cpu); > + if (console_seq == log_next_seq || now > end_time) > break; > > msg = log_from_idx(console_idx); > @@ -2121,6 +2136,23 @@ skip: > up(&console_sem); > > /* > + * If the printing took too long, wait a bit to give other CPUs a > + * chance to take console_sem or at least provide some time for > + * interrupts to be processed (if we are lucky enough and they are > + * enabled at this point). > + */ > + if (now > end_time) { > + /* > + * We won't reach RCU quiescent state anytime soon, silence > + * the warnings. > + */ > + local_irq_save(flags); > + rcu_cpu_stall_reset(); > + local_irq_restore(flags); > + ndelay(max_interrupt_disabled_duration() / 2); > + } > + > + /* > * Someone could have filled up the buffer again, so re-check if there's > * something to flush. In case we cannot trylock the console_sem again, > * there's a new owner and the console_unlock() from them will do the > -- > 1.7.1 > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ >