From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752488AbcGARWu (ORCPT ); Fri, 1 Jul 2016 13:22:50 -0400 Received: from mail-oi0-f66.google.com ([209.85.218.66]:34189 "EHLO mail-oi0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751953AbcGARWs (ORCPT ); Fri, 1 Jul 2016 13:22:48 -0400 Date: Fri, 1 Jul 2016 12:22:32 -0500 From: Tejun Heo To: Viresh Kumar Cc: Greg Kroah-Hartman , Linux Kernel Mailing List , vlevenetz@mm-sol.com, vaibhav.hiremath@linaro.org, alex.elder@linaro.org, johan@kernel.org Subject: Re: [Query] Preemption (hogging) of the work handler Message-ID: <20160701172232.GD28719@htj.duckdns.org> References: <20160701165959.GR12473@ubuntu> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20160701165959.GR12473@ubuntu> User-Agent: Mutt/1.6.1 (2016-04-27) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hello, Viresh. On Fri, Jul 01, 2016 at 09:59:59AM -0700, Viresh Kumar wrote: > The system watchdog uses a delayed-work (1 second) for petting the > watchdog (resetting its counter) and if the work doesn't reset the > counters in time (another 1 second), the watchdog resets the system. > > Petting-time: 1 second > Watchdog Reset-time: 2 seconds > > The wq is allocated with: > wdog_wq = alloc_workqueue("wdog", WQ_HIGHPRI, 0); You probably want WQ_MEM_RECLAIM there to guarantee that it can run quickly under memory pressure. In reality, this shouldn't matter too much as there aren't many competing highpri work items and thus it's likely that there are ready highpri workers waiting for work items. > The watchdog's work-handler looks like this: > > static void pet_watchdog_work(struct work_struct *work) > { > > ... > > pet_watchdog(); //Reset its counters > > /* CONFIG_HZ=300, queuing for 1 second */ > queue_delayed_work(wdog_wq, &wdog_dwork, 300); > } > > kernel: 3.10 (Yeah, you can rant me for that, but its not something I > can decide on :) Android? > Symptoms: > > - The watchdog reboots the system sometimes. It is more reproducible > in cases where an (out-of-tree) bus enumerated over USB is suddenly > disconnected, which leads to removal of lots of kernel devices on > that bus and a lot of print messages as well, due to failures for > sending any more data for those devices.. I see. > Observations: > > I tried to get more into it and found this.. > > - The timer used by the delayed work fires at the time it was > programmed for (checked timer->expires with value of jiffies) and > the work-handler gets a chance to run and reset the counters pretty > quickly after that. Hmmm... > - But somehow, the timer isn't programmed for the right time. > > - Something is happening between the time the work-handler starts > running and we read jiffies from the add_timer() function which gets > called from within the queue_delayed_work(). > > - For example, if the value of jiffies in the pet_watchdog_work() > handler (before calling queue_delayed_work()) is say 1000000, then > the value of jiffies after the call to queue_delayed_work() has > returned becomes 1000310. i.e. it sometimes increases by a value of > over 300, which is 1 second in our setup. I have seen this delta to > vary from 50 to 350. If it crosses 300, the watchdog resets the > system (as it was programmed for 2 seconds). That's weird. Once the work item starts executing, there isn't much which can delay it. queue_delayed_work() doesn't even take any lock before reading jiffies. In the failing cases, what's jiffies right before and after pet_watchdog_work()? Can that take long? > So, we aren't able to queue the next timer in time and that causes all > these problems. I haven't concluded on why is that so.. > > Questions: > > - I hope that the wq handler can be preempted, but can it be this bad? It doesn't get preempted more than any other kthread w/ -20 nice value, so in most systems it shouldn't get preempted at all. > - Is it fine to use the wq-handler for petting the watchdog? Or should > that only be done with help of interrupt-handlers? It's absoultely fine but you'd prolly want WQ_HIGHPRI | WQ_MEM_RECLAIM. > - Any other clues you can give which can help us figure out what's > going on? > > Thanks in advance and sorry to bother you :) I'd watch sched TPs and see what actually is going on. The described scenario should work completely fine. Thanks. -- tejun