From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759403AbcGKPpI (ORCPT ); Mon, 11 Jul 2016 11:45:08 -0400 Received: from mail-pa0-f41.google.com ([209.85.220.41]:36827 "EHLO mail-pa0-f41.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755292AbcGKPpF (ORCPT ); Mon, 11 Jul 2016 11:45:05 -0400 Date: Tue, 12 Jul 2016 00:44:38 +0900 From: Sergey Senozhatsky To: Jan Kara , Viresh Kumar Cc: Tejun Heo , Greg Kroah-Hartman , Linux Kernel Mailing List , vlevenetz@mm-sol.com, vaibhav.hiremath@linaro.org, alex.elder@linaro.org, johan@kernel.org, akpm@linux-foundation.org, rostedt@goodmis.org, Sergey Senozhatsky , Sergey Senozhatsky Subject: Re: [Query] Preemption (hogging) of the work handler Message-ID: <20160711154438.GA528@swordfish> References: <20160701165959.GR12473@ubuntu> <20160701172232.GD28719@htj.duckdns.org> <20160706182842.GS2671@ubuntu> <20160711102603.GI12410@quack2.suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160711102603.GI12410@quack2.suse.cz> User-Agent: Mutt/1.6.2 (2016-07-01) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, Thanks for Cc-ing. I'm attending an internal 2-days training now, so I'm a bit slow at answering emails, sorry. On (07/11/16 12:26), Jan Kara wrote: [..] > > These print messages continue from 2994.918 to 2996.268 (1.35 seconds) > > and they hog the work-handler for that long, which results in watchdog > > reboot in our setup. The 3.10 kernel implementation of the printk > > looks like this (if I am not wrong): > > > > local_irq_save(); > > flush-console-buffer(); //console_unlock() > > local_irq_restore(); > > > > So, the current CPU will try to print all the messages from the > > buffer, before enabling the interrupts again on the local CPU and so I > > don't see the hrtimer fire at all for almost a second. > > right. apart from cases when the existing console_unlock() behaviour can simply "block" a process to flush the log_buf to slow serial consoles (regardless the process execution context) and make the system less responsive, I have around ~10 absolutely different scenarios on my list that may cause soft/hard lockups, rcu stalls, oom-s, etc. and console_unlock() is the root cause there. the simplest ones involve heavy printk() usage, the trickier ones do not necessarily have anything that is abusing printk(): a moderate printk() pressure coming from other CPUs on the system and more or less active tty -> UART can do the trick, because uart interrupt service routine and call_console_drivers()->write() have to compete for the same uart port spin_lock. soft lockups are probably the most common problems, though, it's not all that easy to catch, because watchdog does not ring the bell straight after preempt_enable(), but from hrtimer interrupt, that happens approx every 4 seconds. by this time CPU can be somewhere far away from console_unlock(). I had an idea of doing watchdog soft lockup check from preempt_enable(), when it brings preempt_count down to zero, but not sure I can recall how well did it go. > > I tried looking at if something related to this changed between 3.10 > > and mainline, and found few patches at least. One of the important > > ones is: > > > > commit 5874af2003b1 ("printk: enable interrupts before calling > > console_trylock_for_printk()") > > > > I wasn't able to backport it cleanly to 3.10 yet to see it makes thing > > work better though. But it looks like it was targeting similar > > problems. > Yes. We have similar problems as you observe on machines when they do a lot > of printing (usually due to device discovery or similar reasons). The > problem is not fully solved even upstream as Andrew is reluctant to merge > the patches. Sergey (added to CC) has the latest version of the series [1]. > If you are interested, I can send you the patches for 3.12 kernel which we > carry in SLES kernels and which fixes the issue for us. It is significanly > different from current upstream version but it works good enough for us. yes, an alternative link /* lkml.org is pretty unreliable sometimes*/ is: http://marc.info/?l=linux-kernel&m=146314209118602 I don't have a backport to 3.10, sorry. I had it some time ago (not the current version, tho), but I think I lost it by now, don't have to deal with 3.10 anymore. I'll re-spin the series in a day or two, I think. A rebased version (against next-20160711), basically, has only that KERN_CONT patch as part of 0001 now: http://marc.info/?l=linux-kernel&m=146717692431893 hopefully it will re-fresh the discussion and I'll be able to polish the series so Andrew will be less sceptical about the whole thing. -ss